Re: Random GCC segfaults -- Was: [2.6.16] slab error in slab_destroy_objs(): cache `radix_tree_node'...

From: Pavel Machek
Date: Tue Mar 28 2006 - 08:51:16 EST


Hi!

> On Tue, 28 Mar 2006 00:41:37 -0800
> Andrew Morton <akpm@xxxxxxxx> wrote:
>
> > If those errors had no corresponding kernel messages then what you have is
> > a classic symptom of failing memory hardware. Suggest you grab memtest86,
> > run it for 24 hours.
>
> I've already run memtest86+ for hours (not 24 ok... "only" 4/5h) and I
> found this:
>
> An easly reproducilble memory failure (single bit flipping always at
> the same address) <---- this one goes AWAY disabling bank interleaving
> in BIOS.
>
> Another memory failure (different address, always one bit flipping)
> isn't found by memtest86+: I found it with CONFIG_DEBUG_SLAB and
> I "fixed" it with memmap=... boot option.
>
>
> Now, these 2 problems are both in my first 256MB memory module, so maybe
> it is really another memory failure.
>
> BUT now that I'm back on 2.6.15.6 I'm compiling a LOT of big CPP
> projects and I haven't seen a single GCC segfault yet.
>
> Maybe I should retry with 2.6.16 and if I can reproduce the problem I
> can start testing 2.6.16-rc1 and so on...

I'd really get new RAM... If the machine is "known bad", debugging on
it is likely waste of time.
Pavel
--
Picture of sleeping (Linux) penguin wanted...
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/