Re: [bug] SLOB crash, 2.6.24-rc2

From: Matt Mackall
Date: Wed Nov 14 2007 - 14:43:51 EST


On Wed, Nov 14, 2007 at 08:05:01PM +0100, Ingo Molnar wrote:
>
> * Matt Mackall <mpm@xxxxxxxxxxx> wrote:
>
> > > > [ 61.245190] rc.sysinit used greatest stack depth: 1680 bytes left
> > > > [ 61.386859] list_add corruption. prev->next should be next (407d973c), but was 418cf818. (prev=41877098).
> > > > [ 61.396328] ------------[ cut here ]------------
> > > > [ 61.400910] kernel BUG at lib/list_debug.c:33!
> > > > [ 61.405330] invalid opcode: 0000 [#1] DEBUG_PAGEALLOC
> > > >
> > > > looks like memory corruption of some sort and it's reproducible. Picking
> > > > CONFIG_SLUB makes the crash go away. Booting v2.6.23 with the same
> > > > .config works fine.
> > >
> > > Hmmm, the changes in SLOB since v2.6.23 are all trivial. I'll try to
> > > reproduce it with your config, but it doesn't seem promising.
> >
> > Couldn't reproduce it here, let me know if you get anywhere with your
> > bisect.
>
> the bug went away - and the only thing i did was a networking config
> tweak. So maybe something in networking corrupts memory? I'm not sure i
> can restore the old state. (i had lots of problems with net interface
> renaming not working in .24)

Quite possible. SLOB is more sensitive to off by one bugs because it
doesn't have the power-of-two buckets that SLAB/SLUB have. IIRC,
SLAB/SLUB's debugging features won't detect when you request 28 bytes,
get 32, then overwrite byte 29. But that will damage other objects or
the free list in SLOB.

But this isn't the per-page SLOB list that's getting clobbered, this
is the list of pages held in struct page.

--
Mathematics is the supreme nostalgia of our time.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/