Samuel Ortiz | 1 May 02:20 2007

Re: [PATCH] fix a race between tx and rx paths

On Fri, Apr 27, 2007 at 09:43:39AM +0200, Guennadi Liakhovetski wrote:
> On Fri, 27 Apr 2007, Samuel Ortiz wrote:
> > After looking a bit more carefully at it, I have a remark regarding this
> > patch:
> > 
> > > > diff -p -u -r1.1.1.8.4.1 irlap_frame.c
> > > > --- a/net/irda/irlap_frame.c	2 Apr 2007 10:36:28 -0000	1.1.1.8.4.1
> > > > +++ b/net/irda/irlap_frame.c	19 Apr 2007 13:20:36 -0000
> > > >  <at>  <at>  -796,6 +796,7  <at>  <at>  void irlap_send_data_primary_poll(struct
> > > >  		self->vs = (self->vs + 1) % 8;
> > > >  		self->ack_required = FALSE;
> > > >  
> > > > +		irlap_next_state(self, LAP_NRM_P);
> > > >  		irlap_send_i_frame(self, tx_skb, CMD_FRAME);
> > > >  	} else {
> > > >  		IRDA_DEBUG(4, "%s(), sending unreliable frame\n", __FUNCTION__);
> > > >  <at>  <at>  -808,6 +809,7  <at>  <at>  void irlap_send_data_primary_poll(struct
> > > >  			skb->data[1] |= PF_BIT;
> > > >  			irlap_send_ui_frame(self, skb_get(skb), self->caddr, CMD_FRAME);
> > > >  		}
> > > > +		irlap_next_state(self, LAP_NRM_P);
> > > >  	}
> > Here you're calling irlap_next_state() after the last frame has been sent.
> > If irlap_send_data_primary_poll() gets preempted, and we receive the peer's
> > answer before we call irlap_next_state(), then we may hit the same bug you are
> > trying to fix.
> 
> Yes, I know. It was a deliberate decision. I couldn't reproduce a race in 
> those branches, and, therefore couldn't test any fix. So, I preferred to 
> go the "least impact" way.
(Continue reading)

Samuel Ortiz | 1 May 02:53 2007

Re: [irda-users] [BUG] 2.6.20.1-rt8 irnet + pppd recursive spinlock...

On Mon, Apr 30, 2007 at 03:24:05PM +0200, Guennadi Liakhovetski wrote:
> On Tue, 10 Apr 2007, Samuel Ortiz wrote:
> 
> > Hi Guennadi,
> > 
> > The patch below schedules irnet_flow_indication() asynchronously. Could
> > you please give it a try (it builds, but I couldn't test it...) ? :
> 
> Ok, your patch (still below) works too (now that I fixed that state 
> machine race, btw, we still have to decide on the final form how it goes 
> in the mainline) __after__ you also add the line
> 
> +	INIT_WORK(&new->irnet_flow_work, irttp_flow_restart);
> 
> in irttp_dup() (remember spinlock_init()?:-)), otherwise it oopses.
good catch, again...Yes, I do remember the irttp_dup bug ;-)

 
> Generally, I like your patch better than mine to ppp_generic.c, where I 
> explicitly check if a recursion is occurring. Still, I am a bit concerned 
> about introducing yet another execution context into irda... We have seen 
> a couple of locking issues there already in the last 2-3 months especially 
> under rt-preempt... Would you be able to run some tests too? 
I think I can run some tests here as well, but probably not as many as you:
I'm not doing IrDA stuff full time while it seems you currently are.
But I will definitely spend some time this week running my IrDA stack with this
patch applied. I didn't bother to do that earlier as you first reported some
oops with this patch applied.

> I will be 
(Continue reading)

Guennadi Liakhovetski | 2 May 11:53 2007
Picon

Re: [PATCH] fix a race between tx and rx paths

On Tue, 1 May 2007, Samuel Ortiz wrote:

> The correct behviour is to move to the LAP_NRM_P state.
> All the frame from the current TX window are queued in the wx_list. After
> sending all our frames from the current window, the peer will acknowledge the
> last frame it received. We can then update our wx_list and we're left with an
> empty list if the last frame was ACKed. If not, all the remaining frames are
> resent (see irlap_update_nr_received() and irlap_resend_rejected_frame()).
> 
> So, that let me believe that we should probably set our state to LAP_NRM_P just
> before calling irlap_send_data_primary_poll(). In any case, TX failure or not,
> we must be in NRM after this call, otherwise we will discard the peer response
> and _then_ we're stuck.

Hm, if we don't send the last frame with the pf bit set, e.g., because 
skb_clone() had failed, the peer will not reply, because we are the 
primary station. The next event that will come is POLL_TIMEOUT_EXPIRED. 
Then, if we switch to NRM_P this event will not be processed, it is only 
processed in XMIT_P. Looks like NRM_P must only be set after we've 
successfully sent a pf bit and before we get a response.

Thanks
Guennadi
---------------------------------
Guennadi Liakhovetski, Ph.D.
DSA Daten- und Systemtechnik GmbH
Pascalstr. 28
D-52076 Aachen
Germany

(Continue reading)

Guennadi Liakhovetski | 2 May 13:05 2007
Picon

Re: [irda-users] [BUG] 2.6.20.1-rt8 irnet + pppd recursive spinlock...

On Tue, 1 May 2007, Samuel Ortiz wrote:
> But I will definitely spend some time this week running my IrDA stack 
> with this patch applied. I didn't bother to do that earlier as you first 
> reported some oops with this patch applied.

I think, what I reported was not an Oops, but the race that we're also 
fixing ATM - the one in the state machine. So, unrelated.

> On Mon, Apr 30, 2007 at 03:24:05PM +0200, Guennadi Liakhovetski wrote:

> > in irttp_dup() (remember spinlock_init()?:-)), otherwise it oopses.
> good catch, again...Yes, I do remember the irttp_dup bug ;-)

I've put a tsap_init() function that does those common initialisation 
calls, so we have a smaller chance of forgetting the dup() path next 
time...

> > I will be 
> > testing it too, but don't know how much longer and how intensively. Do you 
> > think we might get some new problems with this new context?
> It seems quite safe to me. But more importantly, I thing we do want to call
> the flow indication routine asynchronously in that specific case. 
> There is one thing that this patch is missing though: we should probably
> clean the worqueue right before we destroy the TTP instance. The work routine
> checks for NULL pointer, but still...

I thought about it too, but the only thing you can do is 
flush_scheduled_work(), is this what you mean?

The test with your patch stopped inside a ftp transfer - the ftp client 
(Continue reading)

Guennadi Liakhovetski | 3 May 10:12 2007
Picon

ppp_generic.c recursive spinlock_bh(&pch->downl)

Hi Paul,

This has been discussed on these lists since some time already, but as we 
don't seem to come to a proper fix, I decided to summarize our problem 
once again.

A short summary from the ppp PoV:

ppp_push()
{
	spin_lock_bh(&pch->downl);
	pch->chan->ops->start_xmit(pch->chan, skb)
	{
		...
		ppp_output_wakeup()
		{
			ppp_channel_push()
			{
				spin_lock_bh(&pch->downl);
BANG!...

This happens with IrNET. So, the question is, is it legitimate to call 
ppp_channel_push() from the .start_xmit() method? How do other ppp drivers 
do it? Looks like, for examplke, ppp_async checks explicitly for 
recursion, synctty does a try_spinlock_bh()... Is it just a bug in IrNET 
and one just cannot do this, or is there a simple fix for this?

Samuel's suggestion is to offload the call to ppp_output_wakeup() to the 
keventd workqueue, but we haven't been able to make this work properly so 
far.
(Continue reading)

Alex Villací­s Lasso | 3 May 19:48 2007
Picon

New experimental driver for USB IrDA dongle, looking for testers

I have this USB IrDA dongle, for which no Linux driver currently exists 
in mainline. When plugged in, lsusb reports the following:

Bus 002 Device 002: ID 07c0:4200 Code Mercenaries Hard- und Software GmbH
Device Descriptor:
  bLength                18
  bDescriptorType         1
  bcdUSB               1.10
  bDeviceClass            0 (Defined at Interface level)
  bDeviceSubClass         0
  bDeviceProtocol         0
  bMaxPacketSize0         8
  idVendor           0x07c0 Code Mercenaries Hard- und Software GmbH
  idProduct          0x4200
  bcdDevice            0.00
  iManufacturer           1 KingSun Co,Ltd
  iProduct                2 USB to Irda
  iSerial                 0
  bNumConfigurations      1
  Configuration Descriptor:
    bLength                 9
    bDescriptorType         2
    wTotalLength           41
    bNumInterfaces          1
    bConfigurationValue     1
    iConfiguration          4 Usb to Irda
    bmAttributes         0x80
    MaxPower              100mA
    Interface Descriptor:
      bLength                 9
(Continue reading)

Samuel Ortiz | 7 May 01:27 2007

Re: [PATCH] fix a race between tx and rx paths

On Wed, May 02, 2007 at 11:53:54AM +0200, Guennadi Liakhovetski wrote:
> On Tue, 1 May 2007, Samuel Ortiz wrote:
> 
> > The correct behviour is to move to the LAP_NRM_P state.
> > All the frame from the current TX window are queued in the wx_list. After
> > sending all our frames from the current window, the peer will acknowledge the
> > last frame it received. We can then update our wx_list and we're left with an
> > empty list if the last frame was ACKed. If not, all the remaining frames are
> > resent (see irlap_update_nr_received() and irlap_resend_rejected_frame()).
> > 
> > So, that let me believe that we should probably set our state to LAP_NRM_P just
> > before calling irlap_send_data_primary_poll(). In any case, TX failure or not,
> > we must be in NRM after this call, otherwise we will discard the peer response
> > and _then_ we're stuck.
> 
> Hm, if we don't send the last frame with the pf bit set, e.g., because 
> skb_clone() had failed, the peer will not reply, because we are the 
> primary station. 
Yes, that's right. My reasoning was right only for frames with the P/F bit
_not_ set :-/ I guess that's what I deserve for replying to emails at 2am...

> The next event that will come is POLL_TIMEOUT_EXPIRED. 
> Then, if we switch to NRM_P this event will not be processed, it is only 
> processed in XMIT_P. Looks like NRM_P must only be set after we've 
> successfully sent a pf bit and before we get a response.
Yes, I agree. However, the current code doesn't let us know if 
irlap_send_i_frame(), irlap_send_rr_frame() or irlap_send_ui_frame() succeded.
Those routines can fail silently and we currently still set the stack to
NRM_P.
With the below patch, we won't fix that issue (going to NRM_P mode even if
(Continue reading)

Alex Villací­s Lasso | 7 May 18:19 2007
Picon

[PATCH] IrDA: Add support for KingSun/DonShine USB IrDA dongle

This patch adds support for the KingSun/DonShine USB IrDA dongle.

This dongle does not follow the usb-irda specification, so it needs its 
own special driver. In addition, it uses interrupt endpoints instead of 
bulk ones as the rest of USB IrDA dongles supported by Linux (just to be 
different?) and data reads need to be parsed to extract the valid bytes 
before being unwrapped (details in the comment at the start of the 
source). No speed commands have been discovered for this dongle, and I 
suspect it does not have any at all.

On plugin, this dongle reports vendor and device IDs: 0x07c0:0x4200 .

The Windows driver that is used normally to control this dongle has a 
filename of DSIR620.SYS .

Signed-off-by: Alex Villacís Lasso <a_villacis@...>

--

-- 

perl -e '$x = 2.4; print sprintf("%.0f + %.0f = %.0f\n", $x, $x, $x + $x);'

Attachment (kingsun-irda.diff): text/x-patch, 21 KiB
-------------------------------------------------------------------------
This SF.net email is sponsored by DB2 Express
Download DB2 Express C - the FREE version of DB2 express and take
control of your XML. No limits. Just data. Click to get it now.
http://sourceforge.net/powerbar/db2/
(Continue reading)

Samuel Ortiz | 8 May 01:12 2007

Re: [PATCH] IrDA: Add support for KingSun/DonShine USB IrDA dongle

Hi Alex,

The code looks quite good to me. Some minor comments though:

On Mon, May 07, 2007 at 11:19:17AM -0500, Alex Villacis Lasso wrote:
> +static int qos_mtt_bits = 0x07 /* > 1ms */ ;
> +module_param(qos_mtt_bits, int, 0);
> +MODULE_PARM_DESC(qos_mtt_bits, "Minimum Turn Time");
Why is that module parameter needed ? Did you have to tune it in order to get
the driver working ? If so, is the default value ane empiric one, or is it
extracted from e.g. what the windows driver send during link negociation ?

> +	/* Calculate how much data can be transmitted in this urb */
> +	usb_fill_int_urb(kingsun->tx_urb, kingsun->usbdev,
> +		usb_sndintpipe(kingsun->usbdev, kingsun->ep_out),
> +		kingsun->out_buf, wraplen, kingsun_send_irq,
> +		kingsun, 1);
> +
> +	if ((ret = usb_submit_urb(kingsun->tx_urb, GFP_ATOMIC))) {
> +		IRDA_ERROR("kingsun_hard_xmit: failed tx_urb submit: %d", ret);
If you decided to use the USB debug routines, then this should be an err()
rather than an IRDA_ERROR().

> +/* Receive callback function */
> +static void kingsun_rcv_irq(struct urb *urb)
> +{
> +	struct kingsun_cb *kingsun = urb->context;
> +	int ret;
> +
> +	/* in process of stopping, just drop data */
(Continue reading)

andrzej zaborowski | 8 May 01:41 2007
Picon

Re: [PATCH] IrDA: Add support for KingSun/DonShine USB IrDA dongle

Hi Alex,

On 07/05/07, Alex Villací­s Lasso <avillaci <at> ceibo.fiec.espol.edu.ec> wrote:
> This patch adds support for the KingSun/DonShine USB IrDA dongle.
[...]
> On plugin, this dongle reports vendor and device IDs: 0x07c0:0x4200 .

I don't have this model but I have two different dongles with Kingsun
Semiconductor chips.
Their IDs are 07d0:4100 and 07d0:4959. The former one reports
*exactly* the same properties (device descriptor, configuration
descriptor, endpoints configuration) as yours except for the
Vendor/Product ID and the usb strings (
Vendor: Dazzle
Manufacturer: Kingsun Semiconductor
Product: USB to Serial
) and I'm pretty sure the protocol it uses is also the same, so it may
be a good idea to add 07d0:4100 to your driver's USB ID table. I
tested the previous version of your patch but I couldn't get any
communication going on (neither Tx nor Rx) and the symptoms were
exactly the same as with MsWindows driver running under QEMU, so I
suspect the device is simply broken, but the driver is correct for
this device.

The second dongle (0x07d0:4959) is even funnier because there's only
one Interrupt IN EP which however is unused - all transfers go through
EP 0. For this one I do know how to set the baud rate and other
properties, it's done with an 8 byte packet sent to EP0, with
bRequestType set to 0x21, bRequest set to 0x09, wValue set to 0x0200
and wIndex set to 0x0001. The contents of the packet are as follows:
(Continue reading)


Gmane