Re: slab corruption with current -git
From: Linus Torvalds
Date: Mon Oct 10 2016 - 12:15:17 EST
On Mon, Oct 10, 2016 at 1:24 AM, David Miller <davem@xxxxxxxxxxxxx> wrote:
>
> So I've been reviewing this patch and it looks fine, but I also want
> to figure out what is actually causing the OOPS and I can't spot it
> yet.
Yeah, I'm not actually sure the old linked list implementation is
buggy - it might just be ugly. I tried to follow the old code, and I
couldn't.
So the patch I sent out was a combination of "that's not how you
should do singly linked lists" and "those special cases make me
worry". In particular, the old code really ended up doing odd things
in the "can't find entry" case, because it would exit the loop with a
non-NULL 'entry' just because the next entry was NULL..
> One possible way to see that oops is to free the head entry of the
> chain without unlinking it. The next unregister will dereference a
> POISON pointer.
>
> Actually...
>
> The POISON value comes not from a hook entry, but from the array of
> pointers in the per-netns datastructure.
>
> This means that the netns is possibly getting freed up before we
> unregister the netfilter hooks.
That is certainly one possible explanation for it, yes. However, I
didn't think that part had changed, had it?
The other thing I find a bit odd in that new single-linked list code is this:
- nf_hook_slow():
...
RCU_INIT_POINTER(state->hook_entries, entry);
which makes me worried.. It copies the head entry of the list, and
maybe it will then (later) end up being used stale. I don't know. But
it looks a bit iffy.
Linus