Re: SCSI Kernel Problem - BAD

Gerard Roudier (groudier@iplus.fr)
Thu, 14 Mar 1996 22:58:05 +0000 (GMT)


On Wed, 13 Mar 1996, Michael Neuffer wrote:

> On Tue, 12 Mar 1996, Eric Youngdale wrote:
>
> > I haven't been paying close attention - is it only the NCR and
> > Adaptec 2xxx series drivers that are showing this problem?
>
> No, it seems that all drivers are having this problem.
>

Here is the mail I sent to Eric@aib.com (cc: Drew@poohsticks.org) on the
3rd of March 1996 (Subject: tagged queue and Linux).

Eric,

Here is the modification of allocate_device() that I propose in order
to make it optimal for SCSI drivers with tagged command queuing support.
I have run many tests under 1.2.13 without any problem at all.
There are no race conditions, since the test of the busy state of the SCwait
control block and the call to sleep_on() are both done with interrupts disabled.
I have also run some successful tests under 1.3.69 (where (request.dev >= 0)
becomes (request.rq_status != RQ_INACTIVE)).
We can add a test of "intr_count" before calling sleep_on() if you think
that it is necessary.

    save_flags(flags);
    cli();
    /* See if this request has already been queued by an interrupt routine */
    if (req && ((req->dev < 0) || (req->dev != dev))) {
        restore_flags(flags);
        return NULL;
    }
#if 1 /* NEW CODE */
    if (!SCpnt || SCpnt->request.dev >= 0) { /* Might have changed */
        if (wait && SCwait && SCwait->request.dev >= 0) {
            sleep_on(&device->device_wait);
            restore_flags(flags);
        } else {
            restore_flags(flags);
            if (!wait) return NULL;
            if (!SCwait) {
                printk("Attempt to allocate device target %d, lun %d\n",
                       device->id, device->lun);
                panic("No device found in allocate_device\n");
            }
        }
#else /* ORIGINAL CODE */
    if (!SCpnt || SCpnt->request.dev >= 0) /* Might have changed */
    {
        restore_flags(flags);
        if (!wait) return NULL;
        if (!SCwait) {
            printk("Attempt to allocate device target %d, lun %d\n",
                   device->id, device->lun);
            panic("No device found in allocate_device\n");
        }
        SCSI_SLEEP(&device->device_wait,
                   (SCwait->request.dev > 0));
#endif
    } else {


On the other hand, the timeout value for hard disk commands should be
increased, since a device may disconnect during a command,
execute many other commands, and reconnect for the original command several
seconds later.

Assuming that the Queue Algorithm Modifier of the Control Mode Page is zero,
we can use SIMPLE QUEUE TAG for read and write operations.
In such circumstances, I got timeouts while running Bonnie.
When I set the timeout for hard disks to 15 seconds, I no longer get any
timeouts.

In order to prevent such timeouts, I have added some code to the Bsd2Linux
driver that forces the use of ORDERED QUEUE TAG for the next command on a lun
when some "old commands" are still in progress 2 seconds before the requested
timeout. (This will be available in the next version, 1.6.)
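
As a rough illustration of that rule (this is not the actual Bsd2Linux code;
the struct, the function name and the use of whole seconds are assumptions
made only for this sketch), the choice of tag message could look like this.
The two message codes are the SCSI-2 values for SIMPLE QUEUE TAG and
ORDERED QUEUE TAG.

```c
#include <assert.h>

/* SCSI-2 queue tag message codes. */
#define SIMPLE_QUEUE_TAG   0x20
#define ORDERED_QUEUE_TAG  0x22

/* Hypothetical per-lun bookkeeping; field names are illustrative only. */
struct lun_queue {
    int           busy;          /* commands still outstanding on this lun */
    unsigned long oldest_start;  /* start time (seconds) of the oldest one */
};

/*
 * Pick the tag message for the next command on a lun.
 * If the oldest outstanding command is within `margin` seconds of its
 * `timeout`, return ORDERED QUEUE TAG so the device has to finish
 * everything queued earlier before it executes the new command.
 * Otherwise return SIMPLE QUEUE TAG and let the device reorder freely.
 */
static int choose_queue_tag(const struct lun_queue *q, unsigned long now,
                            unsigned long timeout, unsigned long margin)
{
    assert(timeout >= margin);   /* the margin cannot exceed the timeout */

    if (q->busy > 0 && now - q->oldest_start >= timeout - margin)
        return ORDERED_QUEUE_TAG;
    return SIMPLE_QUEUE_TAG;
}
```

With a timeout of 6 seconds and a margin of 2 seconds, a command that has
been outstanding for 4 seconds or more makes the next command on that lun
go out with an ordered tag.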

With this improvement, I never get such timeouts with the standard value of
6 seconds (7 seconds for 1.3.69) for hard disk commands (see sd.c).

I have looked at the standard Linux drivers that may use tagged
command queuing with SIMPLE QUEUE TAG for all operations (advansys,
AM53C974, NCR5380), and I haven't seen any code that prevents such situations.

Have these drivers really been tested with tagged queue enabled?

Regards, Gerard.