Re: 2.1.111: IDE DMA disabled...BLAH...BLAH...

Mike Jagdis
Wed, 29 Jul 1998 17:03:44 +0100 (GMT/BST)

On Wed, 29 Jul 1998, Alan Cox wrote:

> Linus - Yes there is something wrong
> No it cannot be DMA if DMA goes back to 1.3.20, almost every report
> says it started in 2.1.x, most are not UDMA but older DMA
> The only common theme right now seems to be SMP and SCSI, and thats SMP
> machines in the SCSI case (not option), unknown in the IDE case.

My system: 2 x DPT2044UW, 1 x Future Domain PCI, 2 IDE drives each
on a different bus. I have had crashes seemingly due to memory corruption.
Once the root IDE drive (IBM ~6.4GB) was corrupted; twice the first oops
was in find_buffer (quickly followed by _lots_ more oopses); once it was
in tcp_close.

Initially, using the Balles* EATA driver, crashes would happen under
load (ckraid operations, copying large trees) but only intermittently.
Using the eata_dma driver _seems_ fine under load. Since then crashes
have seemed related to load on the Future Domain - but not simple
load (such as streaming through the Exabyte); it seems there has
to be significant other SCSI activity. Over a 20-30GB backup/restore
there will probably have been one crash.

Was using 2.1.107 SMP (this is a single PII-400, BX, no IO-APIC).
Today was using 2.1.107 *UP* with IDE DMA on both drives *disabled*
with hdparm when it happened. Tonight will switch the Future Domain
for another DPT.
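
For anyone wanting to repeat the "IDE DMA disabled" test, a minimal
sketch of the hdparm step follows. The device names /dev/hda and
/dev/hdc are assumptions (one drive on each IDE bus), not taken from
the setup above, and the helper name dma_off_cmds is made up for
illustration. The script only prints the commands; pipe its output to
sh as root to actually apply them.

```shell
#!/bin/sh
# Sketch only: device names below are assumptions, adjust to your system.
# Prints one hdparm command per drive instead of running it.
dma_off_cmds() {
    for dev in "$@"; do
        # hdparm -d0 clears the using_dma flag for the given drive
        echo "hdparm -d0 $dev"
    done
}

dma_off_cmds /dev/hda /dev/hdc
```

Running "hdparm -d" with a device and no value afterwards reports the
current using_dma setting, which is a quick way to confirm DMA is
really off before starting the load test.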

1. Anybody using a Future Domain successfully? (later crashes may
be _solely_ down to the FD)

2. Anybody using the Balles* EATA driver successfully? (initial
crashes happened before I added the FD)

3. Anybody using multiple SCSI adapters under simultaneous load
successfully? (All crashes seem related to simultaneous load on
multiple adapters)

It is possible that both the Balles* EATA driver and the Future
Domain driver have race conditions which break them eventually.
However, my current feeling is that the race is higher level
and loading four or five devices on two or three controllers
just makes it more likely.

Why does eata_dma work? Maybe its timing is different enough to
avoid the race - so far? Maybe it's the only driver of the three
that isn't broken?

I'm not seeing IDE problems (but then I'm not stressing the IDE
at all). I have merrily swapped cables, memory, drives etc. to
eliminate as many of the atoms as possible.


P.S. No, I can't remember how to spell Balles*. Sorry :-).

