Re: FS Corruption in 2.1.109 (fwd)

Alan Cox (alan@lxorguk.ukuu.org.uk)
Wed, 29 Jul 1998 20:31:11 +0100 (BST)


> > It might be better to regard this as an SMP issue,
> > unless you can get DMA to fail on UP.
>
> It's been tested. Alan has numbers, and it does not happen only on UP.
> Note also how he cannot even _enable_ DMA on 2.0.x.

[That one is atypical btw in that respect]

> No wonder 2.0.x doesn't corrupt things, it doesn't even try to use DMA.

The numbers don't back that one up either.

> Maybe this is why the corruption thing hasn't been reported very much for
> older kernels?

I set out to prove there was a simple blacklist of UDMA drives. Apart from
WDC, which may be purely down to the volume of drives made, I see nothing
but randomness.

Similarly, while it appears there may be a general SMP problem, there is no
clear SMP / UP correlation.

I also don't believe it to be hardware. Some of these folks have Win98
and FreeBSD running reliably in UDMA mode - two folks have gone and stress
tested on both systems without problems.

To me that leaves a 2.1 driver or OS block device layer bug; the 2.0 reports
and other OSes working back that up. Mark's bug, Linus' bug adding the io
request locks and local cli/sti, something else - who knows. But since you
do know the 2.0 one is much more stable you have something to go on.

Also, since it's timing related, the conclusion I have to reach, and be very
concerned about, is that once 2.1.x/2.2 goes out to a large body as it is
now, even with DMA commented out (which is what I would call the "ostrich",
not other developers), we are going to leave a trail of trashed file systems
in our wake, PIO or DMA.

So we need to find the real cause. If it's hardware we have to prove it's
a hardware problem. In hindsight I should have asked for chipsets, but I
didn't.

Calling people names doesn't help, however, Linus, nor does using capital
letters a lot.

Alan
