Re: amd64 sata_nv (massive) memory corruption

From: Alistair John Strachan
Date: Fri Aug 01 2008 - 18:19:21 EST


On Friday 01 August 2008 18:30:34 Linas Vepstas wrote:
> Hi,
>
> I'm seeing strong, easily reproducible (and silent) corruption on a
> sata-attached disk drive on an amd64 board. It might be the disk itself,
> but I doubt it; googling suggests that it's somehow iommu-related but I
> cannot confirm this.

Nowhere do you explicitly say that you have run memtest86 on the RAM. Checking
4GB of RAM will take some time (probably several hours), but it will mostly
eliminate bad memory as the cause of the corruption.

IME these kinds of bugs are almost always bad RAM. Since the part of the RAM
that is bad may never be used by kernel code, you may experience no crashes.
This is especially true of machines with a lot of RAM. However, since your
filesystem cache can easily grow to consume all 4GB over time, file data will
eventually pass through the bad region, and you could see this kind of
corruption when copying files.
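
If you want a quick way to reproduce the symptom while memtest86 is pending,
something like the following rough sketch may help. It is only an illustration,
not a diagnostic: it copies a file several times and compares SHA-256 hashes.
The function names (sha256_of, copy_and_verify), the 8MB test size and the
number of rounds are all made up for this example. Note that reads shortly
after a write are usually served from the page cache, so a clean run proves
nothing, and a mismatch does not by itself tell you whether RAM, the
controller or the disk is at fault.

```python
import hashlib
import os
import shutil
import tempfile


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_and_verify(src, dst_dir, rounds=3):
    """Copy src into dst_dir `rounds` times.

    Returns the list of copies whose hash differs from the source hash.
    (Hypothetical helper for illustration only.)
    """
    expected = sha256_of(src)
    mismatches = []
    for i in range(rounds):
        dst = os.path.join(dst_dir, f"copy_{i}.bin")
        shutil.copyfile(src, dst)
        if sha256_of(dst) != expected:
            mismatches.append(dst)
    return mismatches


if __name__ == "__main__":
    # Assumption: a small random file in a temp directory. To exercise more
    # of the page cache you would use files much larger than this.
    with tempfile.TemporaryDirectory() as workdir:
        src = os.path.join(workdir, "source.bin")
        with open(src, "wb") as f:
            f.write(os.urandom(8 * 1024 * 1024))
        bad = copy_and_verify(src, workdir)
        print("mismatched copies:", bad or "none")
```

To stress a 4GB machine you would point this at the suspect disk and use a
data set larger than RAM, so that the copies cannot all stay cached.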

--
Cheers,
Alistair.