Mike Anderson | 1 Oct 2002 01:28

Re: [PATCH] first cut at fixing unable to requeue with no outstanding commands

James Bottomley [James.Bottomley <at> steeleye.com] wrote:
> +		if(SHpnt->host_busy == 0 && SHpnt->host_blocked) {
> +			/* unblock after host_blocked iterates to zero */
> +			if(--SHpnt->host_blocked == 0) {
> +				printk("scsi%d unblocking host at zero depth\n", SHpnt->host_no);
> +			} else {
> +				blk_plug_device(q);
> +				break;
> +			}
> +		}
> +				

Are we guaranteed that blk_run_queues will be called for all types of
I/O?

-andmike
--
Michael Anderson
andmike <at> us.ibm.com

-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo <at> vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Doug Ledford | 1 Oct 2002 01:49

Re: Warning - running *really* short on DMA buffers while doing file transfers

On Fri, Sep 27, 2002 at 05:23:10PM -0400, James Bottomley wrote:
> mjacob <at> feral.com said:
> > Duh. There had been race conditions in the past which caused all of us
> > HBA writers to in fact start swallowing things like QFULL and
> > maintaining internal queues. 
> 
> That was true of 2.2, 2.3 (and I think early 2.4) but it isn't true of late 
> 2.4 and 2.5

Oh, it's true of current 2.4 (as of 2.4.19).  It's broken for new and old 
eh drivers both in 2.4.  Hell, it's still broken for new eh drivers in 2.5 
as well.

-- 
  Doug Ledford <dledford <at> redhat.com>     919-754-3700 x44233
         Red Hat, Inc. 
         1801 Varsity Dr.
         Raleigh, NC 27606


Doug Ledford | 1 Oct 2002 01:54

Re: Warning - running *really* short on DMA buffers while doing file transfers

On Fri, Sep 27, 2002 at 03:28:47PM -0600, Justin T. Gibbs wrote:
> > Linux is perfectly happy just to have you return 1 in queuecommand if the 
> > device won't accept the tag.  The can_queue parameter represents the
> > maximum  number of outstanding commands the mid-layer will ever send.
> > The mid-layer is  happy to re-queue I/O below this limit if it cannot be
> > accepted by the drive.   In fact, that's more or less what queue plugging
> > is about.
> > 
> > The only problem occurs if you return 1 from queuecommand with no other 
> > outstanding I/O for the device.
> > 
> > There should be no reason in 2.5 for a driver to have to implement an
> > internal  queue.
> 
> Did this really get fixed in 2.5?  The internal queuing was completely
> broken in 2.4.  Some of the known breakages were:
> 
> 1) Device returns queue full with no outstanding commands from us
>    (usually occurs in multi-initiator environments).

This may be fixed.

> 2) No delay after busy status so devices that will continually
>    report BUSY if you hammer them with commands never come ready.

This is still broken.  Plus, it has a limited number of retries before it 
simply returns an I/O error, so it basically hammers the device (so it 
can't get unbusy) until a set number of retries have completed then it 
returns an I/O error, giving all sorts of false I/O errors on devices that 
use BUSY status.
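
The failure mode described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not mid-layer code: `toy_device`, `toy_send`, `toy_idle` and `issue_with_retries` are hypothetical names, and the device is assumed to leave BUSY only after a number of consecutive idle ticks, with every command sent while busy resetting that count.

```c
#include <stdbool.h>

/* Toy device (hypothetical): it reports BUSY until it has seen
 * `quiet_needed` consecutive idle ticks. */
struct toy_device {
    unsigned int quiet_needed;  /* idle ticks required to come ready */
    unsigned int quiet_seen;    /* idle ticks seen since the last command */
};

/* Send one command. Returns true if accepted, false for BUSY.
 * A command sent while the device is busy restarts its recovery. */
static bool toy_send(struct toy_device *d)
{
    if (d->quiet_seen >= d->quiet_needed)
        return true;
    d->quiet_seen = 0;
    return false;
}

/* Leave the device alone for `ticks` ticks. */
static void toy_idle(struct toy_device *d, unsigned int ticks)
{
    d->quiet_seen += ticks;
}

/* Issue a command with a bounded number of retries and a fixed delay
 * between attempts. Returns 0 on success, -1 (standing in for an I/O
 * error) once the retries are exhausted. */
static int issue_with_retries(struct toy_device *d,
                              unsigned int max_retries,
                              unsigned int delay_ticks)
{
    for (unsigned int attempt = 0; attempt <= max_retries; attempt++) {
        if (toy_send(d))
            return 0;
        toy_idle(d, delay_ticks);
    }
    return -1;
}
```

In this model a delay of zero never lets the device recover, so every retry is wasted and the caller sees an error, while a delay at least as long as `quiet_needed` succeeds on the second attempt.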

Doug Ledford | 1 Oct 2002 02:07

Re: Warning - running *really* short on DMA buffers while doing file transfers

On Fri, Sep 27, 2002 at 03:08:26PM -0700, Mike Anderson wrote:
> I thought there was discussion previously on mid-layer queue
> adjustments during the (? attach patch ?) but I am having trouble
> finding it.

That's because I didn't release it (hell, now I'm having a hard time 
finding it).

-- 
  Doug Ledford <dledford <at> redhat.com>     919-754-3700 x44233
         Red Hat, Inc. 
         1801 Varsity Dr.
         Raleigh, NC 27606


Doug Ledford | 1 Oct 2002 02:13

Re: Warning - running *really* short on DMA buffers while doing file transfers

On Mon, Sep 30, 2002 at 08:07:11PM -0400, Doug Ledford wrote:
> On Fri, Sep 27, 2002 at 03:08:26PM -0700, Mike Anderson wrote:
> > I thought there was discussion previously on mid-layer queue
> > adjustments during the (? attach patch ?) but I am having trouble
> > finding it.
> 
> That's because I didn't release it (hell, now I'm having a hard time 
> finding it).

Whew!!  Found it.  I'll update it to a current kernel and send it out so 
people can see what I did.

-- 
  Doug Ledford <dledford <at> redhat.com>     919-754-3700 x44233
         Red Hat, Inc. 
         1801 Varsity Dr.
         Raleigh, NC 27606


James Bottomley | 1 Oct 2002 02:38

Re: [PATCH] first cut at fixing unable to requeue with no outstanding commands

andmike <at> us.ibm.com said:
> Are we guaranteed that blk_run_queues will be called for all types of
> I/O? 

Pretty much, it looks like.  It's mainly triggered by I/O stuff in fs, 
including the buffer functions, so I think the assumption that we will be 
called eventually is good.

James


Patrick Mansfield | 1 Oct 2002 17:01

Re: [PATCH] first cut at fixing unable to requeue with no outstanding commands

On Mon, Sep 30, 2002 at 08:38:49PM -0400, James Bottomley wrote:
> andmike <at> us.ibm.com said:
> > Are we guaranteed that blk_run_queues will be called for all types of
> > I/O? 
> 
> Pretty much, it looks like.  It's mainly triggered by I/O stuff in fs, 
> including the buffer functions, so I think the assumption that we will be 
> called eventually is good.
> 
> James

What about applications and devices that only send one IO at a time?
They could still hang.

We have: tape (st or osst), sg usage, partitioning, scanning, direct
use of the block device (i.e. dd if=/dev/sda), and probably others.

-- Patrick Mansfield

James Bottomley | 1 Oct 2002 17:14

Re: [PATCH] first cut at fixing unable to requeue with no outstanding commands

patmans <at> us.ibm.com said:
> What about applications and devices that only send one IO at a time?
> They could still hang.

> We have: tape (st or osst), sg usage, partitioning, scanning, direct
> use of the block device (i.e. dd if=/dev/sda), and probably others. 

I don't believe so.

Unplugging is a global thing.  The request function for a plugged queue will 
always be run, that's a guarantee, so using this approach, the queue will 
stall but never hang forever.

I think for a rejection of a command with none outstanding, a stall is what 
you want (give the host/device time to recover from whatever the problem is).  
The length of the stall will be dependent on I/O pressure in the system, which 
is also roughly what you want: we can afford to give a device quite a while to 
recover if we're not desperate to get I/O to it. We can also handle BUSY 
returns this way too...

James
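
The stall-but-never-hang behaviour can be sketched in user space. This is a minimal model under stated assumptions, not the kernel code: `host_model`, `request_fn_may_dispatch` and `runs_until_dispatch` are hypothetical names, only the zero-depth path of the patch is modelled (`host_busy == 0`), and each call stands in for one run of the request function after an unplug.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the two host fields the patch touches. */
struct host_model {
    unsigned int host_busy;     /* commands currently outstanding */
    unsigned int host_blocked;  /* countdown set when a command is rejected */
};

/* Models one run of the request function on the zero-depth path.
 * Returns true if a command may be dispatched on this run, false if
 * the queue would be plugged again and the run ends. */
static bool request_fn_may_dispatch(struct host_model *h)
{
    assert(h->host_busy == 0);  /* this sketch covers only the zero-depth case */

    if (h->host_blocked) {
        /* unblock after host_blocked iterates to zero */
        if (--h->host_blocked == 0)
            return true;
        return false;           /* corresponds to blk_plug_device(q); break; */
    }
    return true;
}

/* Counts how many runs of the request function it takes before a
 * command can go out. Terminates because every stalled run decrements
 * the countdown. */
static unsigned int runs_until_dispatch(struct host_model *h)
{
    unsigned int runs = 0;

    do {
        runs++;
    } while (!request_fn_may_dispatch(h));
    return runs;
}
```

With `host_blocked` starting at N, the first N-1 runs stall and the Nth dispatches, so in this model the length of the stall depends only on how often something unplugs the queues.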


Mike Anderson | 1 Oct 2002 18:23

Re: [PATCH] first cut at fixing unable to requeue with no outstanding commands

James Bottomley [James.Bottomley <at> steeleye.com] wrote:
> patmans <at> us.ibm.com said:
> > What about applications and devices that only send one IO at a time?
> > They could still hang.
> 
> > We have: tape (st or osst), sg usage, partitioning, scanning, direct
> > use of the block device (i.e. dd if=/dev/sda), and probably others. 
> 
> I don't believe so.
> 
> Unplugging is a global thing.  The request function for a plugged queue will 
> always be run, that's a guarantee, so using this approach, the queue will 
> stall but never hang forever.
> 
> I think for a rejection of a command with none outstanding, a stall is what 
> you want (give the host/device time to recover from whatever the problem is).  
> The length of the stall will be dependent on I/O pressure in the system, which 
> is also roughly what you want: we can afford to give a device quite a while to 
> recover if we're not desperate to get I/O to it. We can also handle BUSY 
> returns this way too...
> 
> James

ok I see the call path now. I was unsure that blk_run_queues would be
called for non-fs IO.

I traced a dd (dd if=/dev/sda of=/dev/null count=2 bs=512) command under
uml with a modified scsi_debug to return 1 on a queuecommand call and
your patch. The trace showed blk_run_queues being called through
wb_kupdate.

James Bottomley | 1 Oct 2002 18:30

Re: [PATCH] first cut at fixing unable to requeue with no outstanding commands

andmike <at> us.ibm.com said:
> 	- Is the call to scsi_delete_timer really necessary. All callers
> 	  have already deleted the timer. The del_timer function takes a
> 	  lock which would be nice to avoid if we do not need to call
> 	  it.  If we want to protect the code we could do a quick check
> 	  on SCset->eh_timeout.function prior to calling. 

That's a belt and braces thing.  I suppose we could change it to BUG_ON timer 
active and see what pops out of the woodwork.

andmike <at> us.ibm.com said:
> 	- Patrick pointed out a while ago that the "if (host->host_busy
> 	  == 0)" check and the similar one for the device will never be
> 	  called because to be in this function the values of these
> 	  variables need to be at least 1. I believe this direct call to
> 	  scsi_retry_command should be removed instead of adjusting the
> 	  check to "== 1" as this seems counter to how you are trying to
> 	  handle busy. 

Yes, I plan on slowly removing all automatic reissues of commands. I've 
already removed the checks in my local tree (and the corresponding reissues).

James
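
The quick check on eh_timeout.function suggested in the first quoted point can be illustrated with a toy model. This is a sketch under assumptions, not the mid-layer implementation: `toy_timer` and its helpers are hypothetical, the lock is reduced to a counter, and deleting the timer is assumed to clear the function pointer.

```c
#include <stddef.h>

/* Hypothetical stand-in for a command's error-handler timer. */
struct toy_timer {
    void (*function)(unsigned long);  /* NULL once the timer has been deleted */
    unsigned int lock_acquisitions;   /* how often the delete path took its lock */
};

/* Placeholder timeout handler; it does nothing in this model. */
static void toy_timeout_handler(unsigned long data)
{
    (void)data;
}

/* Arm the timer by installing a handler. */
static void toy_arm_timer(struct toy_timer *t)
{
    t->function = toy_timeout_handler;
}

/* Unconditional delete: always pays for the lock, armed or not. */
static void toy_delete_timer(struct toy_timer *t)
{
    t->lock_acquisitions++;
    t->function = NULL;
}

/* Guarded delete: skips the lock when the timer is already gone. */
static void toy_delete_timer_if_armed(struct toy_timer *t)
{
    if (t->function != NULL)
        toy_delete_timer(t);
}
```

If every caller has already deleted the timer, the guarded variant never touches the lock, which is the saving the suggestion is after. The BUG_ON alternative would instead assert that the pointer is already NULL at that point.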

