2.4.0-test10pre5: still IDE lockups on HPT366 controller.

From: Winfried Truemper (winni@xpilot.org)
Date: Tue Oct 24 2000 - 19:32:45 EST


Hi Andre,

I tried 2.4.0-test10pre5 and it gives me far fewer errors. It is deceptive,
though: it raises your hopes for the first two minutes, and then it bites
you. The errors on ide0 and ide1 are gone completely, and I have switched
DMA on for them. However, the machine still freezes solid under heavy use
of ide2 and ide3 with unmaskirq=on and using_dma=on. Even an fsck is enough.
The following error messages appear, but not necessarily at the moment the
machine freezes:

APIC error on CPU1 00(02) or 02(02) or 00(08) or 00(04)
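
For readers who want to interpret these values, here is a minimal sketch of
a decoder. It assumes the two numbers in each pair are the local APIC Error
Status Register (ESR) as read before and after the error handler rewrites
it, and that the bits follow the usual ESR layout. The names ESR_BITS,
decode_esr and decode_apic_error are made up for this illustration.

```python
# Bit meanings of the local APIC Error Status Register (ESR).
# Assumption: the standard layout on P6-class CPUs, where bit 4 is reserved.
ESR_BITS = {
    0x01: "send checksum error",
    0x02: "receive checksum error",
    0x04: "send accept error",
    0x08: "receive accept error",
    0x10: "reserved",
    0x20: "send illegal vector",
    0x40: "received illegal vector",
    0x80: "illegal register address",
}


def decode_esr(value):
    """Return the names of all bits set in a single ESR value."""
    return [name for bit, name in ESR_BITS.items() if value & bit]


def decode_apic_error(pair):
    """Decode a pair such as '00(02)' into (before, after) lists of bit names."""
    before, after = pair.rstrip(")").split("(")
    return decode_esr(int(before, 16)), decode_esr(int(after, 16))


if __name__ == "__main__":
    # The four pairs quoted in the report above.
    for pair in ("00(02)", "02(02)", "00(08)", "00(04)"):
        print(pair, decode_apic_error(pair))
```

Under these assumptions, 02 is a receive checksum error, 08 a receive accept
error and 04 a send accept error, all of which concern messages on the APIC
bus rather than the IDE driver itself.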

CPU0 gives the same errors, but the two processors never give the same
message. I could see no pattern. And after some time:

hde: (ide_dma_lostirq) reg50h=0x33, reg52h=0x00, reg5ah=0x01
ide_dmaproc: chipset supported ide_dma_lostirq func only: 13
hde: lost interrupt

and while running fsck:

hdg: timeout waiting for DMA
ide_dmaproc: chipset supported ide_dma_timeout func only: 14

Four dd's started on the raw partition completed successfully after about
three hours, with only a few APIC error messages. Bonnie on a formatted
disk partition on hda and hdc works, but it kills the machine when run on
hde and hdg (bonnie starts by writing single characters to the filesystem,
not sequential data to the block device as I did with dd). The same is
true for a software RAID array built from all four disks.

When I switch off DMA and set unmaskirq=off for ide2 and ide3, the
machine lives longer, but I get the following errors:

hdg: status error: status=0x58 { DriveReady SeekComplete DataRequest }
ide3: reset: master: error: (0x00?)
hdg: lost interrupt
ide2: reset: success
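
The status value in the first line can be checked by hand. The following is
a minimal sketch, assuming the standard ATA status register bit layout and
the flag names the IDE driver prints; ATA_STATUS_BITS and decode_status are
names invented for this illustration.

```python
# ATA status register bits, highest bit first.
# Assumption: standard ATA layout, labelled with the names the driver prints.
ATA_STATUS_BITS = [
    (0x80, "Busy"),
    (0x40, "DriveReady"),
    (0x20, "DeviceFault"),
    (0x10, "SeekComplete"),
    (0x08, "DataRequest"),
    (0x04, "CorrectedError"),
    (0x02, "Index"),
    (0x01, "Error"),
]


def decode_status(status):
    """Return the names of all flags set in an ATA status byte."""
    return [name for bit, name in ATA_STATUS_BITS if status & bit]


if __name__ == "__main__":
    # 0x58 = 0x40 + 0x10 + 0x08
    print(decode_status(0x58))
```

For 0x58 this yields DriveReady, SeekComplete and DataRequest, matching the
flags in the message. Note that the Error bit (0x01) is not set.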

Performance suffers a lot, presumably because ide2 and ide3 share the same
interrupt and block each other with unmaskirq=off. For example, benchmark
numbers with bonnie drop from many megabytes to several hundred
kilobytes (*ouch*). And the disk "feels" that slow; it's not just the
numbers.

The machine has four IDE ports on the motherboard: two are UDMA33 and
two are UDMA66 via an integrated HPT366 controller. Four brand new
Samsung drives are connected to the IDE ports. Motherboard:
Abit BP6, two Celerons @ 366 MHz, not yet overclocked.

Regards
-Winfried



