> It seems that there are bits falling over in memory on your
> machine. So fsck counts the bits in a bitmap block, and comes one
> short. Then it erroneously corrects the master count, and corrects
> the miscorrection on the next run.
Any suggestions on how to test for such possible memory errors?
Compiling the kernel or X11 multiple times (in multiple parallel sessions)
doesn't show any problem. memtest86 doesn't seem to do a good job
of detecting e.g. refresh problems or similar, where you have to wait
a long time without touching memory before errors occur.
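
To make the "wait a long time without touching memory" idea concrete, here is a rough sketch. It assumes a RAM-backed directory is available (e.g. a tmpfs mount such as /dev/shm, which may not exist on every system). hold_test is a hypothetical helper name, not an existing tool. It only exercises the pages that happen to back the file, and those pages may be swapped out under memory pressure, so a clean run proves little.

```shell
#!/bin/sh
# hold_test DIR SIZE_MB IDLE_SECS
# Hypothetical helper: fill a file in DIR with random data, checksum it,
# leave it untouched for IDLE_SECS, then checksum it again.
# Assumption: DIR is RAM-backed (tmpfs or a ramdisk), otherwise this
# tests the disk and the page cache rather than idle memory.
hold_test() {
    dir=$1; size_mb=$2; idle_secs=$3
    f="$dir/holdtest.$$"

    # Random data, so that a flipped bit changes the checksum.
    dd if=/dev/urandom of="$f" bs=1024k count="$size_mb" 2>/dev/null || return 2

    before=$(md5sum < "$f" | cut -d' ' -f1)
    sleep "$idle_secs"          # the idle period: nothing touches the file
    after=$(md5sum < "$f" | cut -d' ' -f1)
    rm -f "$f"

    if [ "$before" = "$after" ]; then
        echo "ok: $before"
    else
        echo "CHANGED: $before -> $after" >&2
        return 1
    fi
}

# Example: hold 64 MB idle for an hour
# hold_test /dev/shm 64 3600
```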
On Jul 08, Ted Deppner wrote:
> I've seen this type of stuff happen from IDE drives running too fast (DMA
> problems), from bad cables on IDE (longer than 18"), and from poorly terminated
> scsi busses (especially Adaptec).
I doubt it (in my case). This is only a single internal cable with two
separate active terminators.
> Harald, I'd suggest giving the following a try. If you don't get the same
> sums each time, you've got a hardware problem... whether drive, cable,
> controller, or bad memory is left as an exercise to the reader. :)
>
> dd if=/dev/sda bs=1024k count=400 | md5sum
> # read 400 MB off the disk, give us a checksum
>
> Run that 3 or more times. If the sums change you may have a problem. This
> works best on unmounted drives of course.
I've already done similar tests with multiple md5sum runs on all files on that fs
with no errors at all. Now I've done ~20 runs on the raw partition (2.6GB), and all
checksums are identical.
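
For anyone who wants to repeat this, the runs can be scripted roughly as below. This is only a sketch: checksum_runs is a hypothetical helper name, and the device, size, and run count in the example are placeholders. Note that if the region read fits in RAM, later runs may be served from the buffer cache instead of the disk, so pick a size larger than main memory to exercise the drive and cable.

```shell
#!/bin/sh
# checksum_runs SRC COUNT_MB RUNS
# Hypothetical helper: read the first COUNT_MB megabytes of SRC RUNS times
# and compare each md5sum against the first one.
checksum_runs() {
    src=$1; count_mb=$2; runs=$3
    first=""
    i=1
    while [ "$i" -le "$runs" ]; do
        sum=$(dd if="$src" bs=1024k count="$count_mb" 2>/dev/null \
              | md5sum | cut -d' ' -f1)
        echo "run $i: $sum"
        if [ -z "$first" ]; then
            first=$sum                      # remember the reference checksum
        elif [ "$sum" != "$first" ]; then
            echo "MISMATCH on run $i" >&2   # same data, different sum
            return 1
        fi
        i=$((i + 1))
    done
    return 0
}

# Example (reading a raw device normally needs root):
# checksum_runs /dev/sda 400 3
```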
Harald
--
All SCSI disks will from now on be required to send an email notice
24 hours prior to complete hardware failure!

Harald Koenig, Inst.f.Theoret.Astrophysik
koenig@tat.physik.uni-tuebingen.de