Re: 2.4.13-ac7 ext3 OOPS

From: Gavin Baker (gavbaker@ntlworld.com)
Date: Sat Nov 24 2001 - 10:55:16 EST


On Thu, Nov 22, 2001 at 08:24:32PM +0000, Stephen C. Tweedie wrote:

> Hi,
>
> On Sun, Nov 18, 2001 at 08:50:39PM +0000, Gavin Baker wrote:
> > A seemingly random OOPS while using netscape. 2.4.13-ac7 with preempt patches on a RH7.2 laptop.
>
> > Nov 18 13:12:45 n-files kernel: EIP: 0010:[get_hash_table+107/208] Not tainted
>
> get_hash_table oopses are almost always caused by bad memory. For
> some reason, the buffer cache hashes are peculiarly sensitive to
> corruptions.

This is definately likely, the machine has a cheap no-name memory
module in it.

> > Nov 18 13:12:45 n-files kernel: eax: c1430000 ebx: ffffffff ecx: 00000002 edx: 00003859
>
> It's not enough to be conclusive, but the other common footprint of
> random memory corruption is register dumps containing a value which is
> all-zeroes except for one flipped bit, like your 0x00000002 value in
> %ecx.

This is another factoid for my notebook. :)

> Let me know if you can reproduce this, but in the absense of any other
> pattern, bad memory is the most likely cause for now.

I cannot reproduce this, i've tried stressing the filesystem but the
oops does not show up. I will cc you if one shows up again.

I've tried memtest86 on the machine, and while it doesn't spot any
errors, it does seem to take an extremly long time to do some of the
tests (The default tests are taking more than 12 hours). I will replace
the memory with some decent branded ram just in case.

Thanks for your time Stephen,

Regards, Gavin Baker

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Fri Nov 30 2001 - 21:00:17 EST