Re: no DRQ after issuing WRITE was Re: 2.4.23-uv3 patch set released

From: Marcelo Tosatti
Date: Tue Dec 30 2003 - 17:38:53 EST




On Tue, 30 Dec 2003, Linus Torvalds wrote:

>
>
> On Tue, 30 Dec 2003, Marcelo Tosatti wrote:
> >
> > Small correction: people are not hitting the WAIT_READY (they are hitting
> > the problem from ide-disk.c, which uses WAIT_DRQ). But still...
>
> Ok. Do you have the full trace? In particular, if there is no locking in
> that path, and interrupts are enabled, you could possibly get not just an
> interrupt, but a preemption event. Now _that_ could blow up the timeout to
> any amount of time, and then even 100ms might not be enough.

The problem is happening in 2.4 too so I believe preemption is not the
culprit. Here are some details:

steve@xxxxxxxxxxxxx wrote:

"Well i only just started getting them and i started with 2.4.20 and
upgraded to 2.4.21 about 6weeks or so ago (or when it came out)"

"hda: status timeout: status=0xd0 { Busy }
hda: no DRQ after issuing WRITE
ide0: reset: success
hda: status timeout: status=0xd0 { Busy }
hda: no DRQ after issuing WRITE
ide0: reset: success"

daniel@xxxxxxxxxxxxxx wrote:

"hda: no DRQ after issuing WRITE
ide0: reset: success
hda: status timeout: status=0xd0 { Busy }

hda: no DRQ after issuing WRITE
ide0: reset: success"

(Daniel wrote the patch which got applied to 2.4, it fixed the problems
for him).

There are several other reports of "no DRQ after issuing {MULTI}WRITE",
some of them probably involved with this bug, some of them potentially
not. You can find more reports (both from 2.6 and 2.4) at:

http://marc.theaimsgroup.com/?l=linux-kernel&w=2&r=1&s=no+DRQ+after+issuing+WRITE&q=b

> Is CONFIG_PREEMPT on in the cases, and is there really no locking
> anywhere? Preempting in the middle of the data transfer phase sounds like
> a fundamentally bad idea, and maybe the code needs a few preempt
> disable/enable pairs somewhere?

>From my fast code read, there is no other locking involved.

It sounds you are right, the timeout is too small --- we need confirmation
from the people who can hit it that increasing it fixes the problem.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/