Re: EXT4-fs error, kernel BUG
From: martin f krafft
Date: Tue Aug 05 2014 - 09:16:10 EST
also sprach Theodore Ts'o <tytso@xxxxxxx> [2014-08-05 14:51 +0200]:
> One likely cause of this issue is that the hardware hiccuped on
> a read, and returned garbage, which is what triggered the "EXT4-fs
> error" message (which is really a report of a detected file system
> inconsistency). A common cause of this is the block address
> getting corrupted, so that the hard drive read the correct data
> from the wrong location.
This sounds like it would happen every time and fsck would catch it.
> The other likely cause is that you are using something like RAID1,
> and one of the copies of the disk block really is corrupted, and
> the kernel read the bad version of the block, but fsck managed to
> read the good version of the block.
It's a RAID10 (using md), so this is a good shot, actually. Which is
bad news for me, because RAID corruption is not nice: when you have
two clocks, you won't know what time it is anymore…
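As an aside on the "two clocks" problem: md can at least report whether
the copies disagree. Writing "check" to md/sync_action in sysfs starts
a read-only scrub, and md/mismatch_cnt afterwards holds the number of
sectors that differed. Below is a minimal Python sketch of reading
those files. The array name "md0" is an assumption (the actual device
is not named in this thread), the helper names are made up for
illustration, and starting a check needs root. Note that md(4) warns a
non-zero count on RAID1/RAID10 is not always a sign of real corruption.

```python
from pathlib import Path


def md_dir(array: str, sysfs_root: str = "/sys") -> Path:
    """Return the md attribute directory for an array such as 'md0'."""
    return Path(sysfs_root) / "block" / array / "md"


def current_action(array: str, sysfs_root: str = "/sys") -> str:
    """Read what the array is doing right now, e.g. 'idle' or 'check'."""
    return (md_dir(array, sysfs_root) / "sync_action").read_text().strip()


def start_check(array: str, sysfs_root: str = "/sys") -> None:
    """Ask md to start a read-only consistency check (needs root)."""
    (md_dir(array, sysfs_root) / "sync_action").write_text("check\n")


def read_mismatch_count(array: str, sysfs_root: str = "/sys") -> int:
    """Return the sector mismatch count left by the last check/repair."""
    text = (md_dir(array, sysfs_root) / "mismatch_cnt").read_text()
    return int(text.strip())


if __name__ == "__main__":
    array = "md0"  # assumption: substitute the real array name
    print(f"{array}: action={current_action(array)}, "
          f"mismatch_cnt={read_mismatch_count(array)}")
```

Run as-is it only reads the current state; call start_check() first and
wait for sync_action to return to "idle" before trusting the count.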
Fortunately, I now managed to tar the filesystem content to
elsewhere without error, so in theory all I have to do now is
recreate it. And I'll recreate the filesystem while we're at it.
That should teach RAID10 again…
I'd still like to drill down to the memory problem…
> It's possible that this was caused by a memory corruption, but it
> wouldn't have been high on my suspect list. Still, if this is
> a new machine, it might not be a bad idea to run memtest86+ for
> 24-48 hours.
… and will do that. I did it before, but I also just upgraded the
RAM and didn't do it again.
Thank you, tytso. Hope to see you at DC14…
--
@martinkrafft | http://madduck.net/ | http://two.sentenc.es/
"not the truth in whose possession any man is, or thinks he is, but
the honest effort he has made to find out the truth, is what
constitutes the worth of man."
-- gotthold lessing
spamtraps: madduck.bogus@xxxxxxxxxxx