Re: Disk Corruption with ide hpt-366 controller & software raid5

Rogier Wolff (R.E.Wolff@BitWizard.nl)
Tue, 14 Sep 1999 08:34:16 +0200 (MEST)


Someone whose name I snipped wrote...
> > > dd if=/dev/md0 count=500000 2> /dev/null | md5sum
> > > done
> > 08d3b2b34dfc667ca96c549f8a8a3c15 -
> > cee7aa5dd1ee81ff63a93bba3830ca31 -
> > a577d2d50f9ebc535b9e49905c29631c -
> > f8c6aea89094543aaf2982ef6504285d -
> > 596f99e3047d18eef9798634a091670b -

> > [root@music /root]# cmp -l test1 test2
> > 6031611 243 253 [ed note: binary: 10100011 10101011]
> > 6031612 323 134 [ed note: binary: 11010011 01011100]
> > 6293755 256 246 [ed note: binary: 10101110 10100110]
> > 6293756 261 76 [ed note: binary: 10110001 00111110]
> > 7641340 102 166 [ed note: binary: 01000010 01110110]
> > 7660796 147 255 [ed note: binary: 01100111 10101101]

If you're seeing THIS kind of errors, it SURE looks like a hardware
issue. If the software is making errors, I'd expect a random byte
inserted somewhere. A block of data shifted one byte. A whole block
corrupted (delivered to the wrong address in memory). Things like
that.

If it's usually "bit 3" that's wrong, it seems that a cable is too
long according to spec, a timing issue on the card etc. etc. Something
like that. Not a software error.

To me that "diff" shows too much correlation between the differing
values. These are NOT random bytes dropped onto the good data by a
wild pointer or something like that.

Roger.

-- 
** R.E.Wolff@BitWizard.nl ** http://www.BitWizard.nl/ ** +31-15-2137555 **
*-- BitWizard writes Linux device drivers for any device you may have! --*
------ Microsoft SELLS you Windows, Linux GIVES you the whole house ------

- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.rutgers.edu Please read the FAQ at http://www.tux.org/lkml/