Re: All filesystems hang under long periods of heavy load (read andwrite) on a filesystem

From: Marcelo Tosatti
Date: Mon Dec 29 2003 - 14:41:49 EST




On Tue, 23 Dec 2003, Ged Haywood wrote:

> Googled lots of similar postings, if anyone wants me to try anything
> out please let me know. Otherwise this is just a heads up to say that
> there is certainly something nasty in the released 2.4.23 IDE code...

<snip>

> Dec 23 12:19:01 www2 kernel: hdb: status timeout: status=0xd0 { Busy }
> Dec 23 12:19:01 www2 kernel: Dec 23 12:19:01 www2 kernel:
> hdb: no DRQ after issuing WRITE Dec 23 12:19:33 www2 kernel: ide0: reset timed-out,
> status=0x80 Dec 23 12:19:33 www2 kernel: hdb: status timeout:
> status=0xd0 { Busy } Dec 23 12:19:33 www2 kernel: Dec 23 12:19:33 www2
> kernel: hdb: no DRQ after issuing WRITE Dec 23 12:20:03 www2 kernel:
> ide0: reset timed-out, status=0x80 Dec 23 12:20:03 www2 kernel:
> end_request: I/O error, dev 03:41 (hdb), sector 35289088 ... #

Hi Ged,

I believe this migh caused by a problem in ide_wait where the kernel
thinks a timeout has happened but it hasnt in practice (the drive is ready
to go).

The attached patch from Daniel Lux should fix it. Its has just been
applied to the 2.4 BK tree.

Please try it.
===== drivers/ide/ide-iops.c 1.7 vs 1.9 =====
--- 1.7/drivers/ide/ide-iops.c Sat Aug 9 11:48:39 2003
+++ 1.9/drivers/ide/ide-iops.c Sun Dec 28 18:14:05 2003
@@ -664,12 +664,22 @@
if ((stat = hwif->INB(IDE_STATUS_REG)) & BUSY_STAT) {
local_irq_set(flags);
timeout += jiffies;
- while ((stat = hwif->INB(IDE_STATUS_REG)) & BUSY_STAT) {
+ stat = hwif->INB(IDE_STATUS_REG);
+ while (stat & BUSY_STAT) {
if (time_after(jiffies, timeout)) {
- local_irq_restore(flags);
- *startstop = DRIVER(drive)->error(drive, "status timeout", stat);
- return 1;
+ /*
+ * do one more status read in case we were interrupted between last
+ * stat = hwif->INB(IDE_STATUS_REG) and time_after(jiffies, timeout)
+ * in wich case the timeout might have been shorter than specified.
+ */
+ if ((stat = hwif->INB(IDE_STATUS_REG)) & BUSY_STAT) {
+ local_irq_restore(flags);
+ *startstop = DRIVER(drive)->error(drive, "status timeout", stat);
+ return 1;
+ }
}
+ else
+ stat = hwif->INB(IDE_STATUS_REG);
}
local_irq_restore(flags);
}