Re: [RFC][PATCH v2] slub: Keep page and object in sync in slab_alloc_node()

From: Steven Rostedt
Date: Fri Jan 18 2013 - 10:04:33 EST


On Fri, 2013-01-18 at 14:44 +0000, Christoph Lameter wrote:
> On Thu, 17 Jan 2013, Steven Rostedt wrote:
>
> > In slab_alloc_node(), after the cpu_slab is assigned, if the task is
> > preempted and moves to another CPU, there's nothing keeping the page and
> > object in sync. The -rt kernel crashed because page was NULL and object
> > was not, and the node_match() dereferences page. Even though the crash
> > happened on -rt, there's nothing that's keeping this from happening on
> > mainline.
> >
> > The easiest fix is to disable interrupts for the entire time from
> > acquiring the current CPU cpu_slab and assigning the object and page.
> > After that, it's fine to allow preemption.
>
> It's easiest to just check for the NULL pointer as initially done. The call
> to __slab_alloc can do what the fastpath does.
>
> And the fastpath will verify that the c->page pointer was not changed.

The problem is that the changes can happen on another CPU, which means
that barrier isn't sufficient.

  CPU0                             CPU1
  ----                             ----
<cpu fetches c->page>
                                   updates c->tid
                                   updates c->page
                                   updates c->freelist
<cpu fetches c->tid>
<cpu fetches c->freelist>

node_match() succeeds even though the
current c->page won't match.

this_cpu_cmpxchg_double() only tests
the object (freelist) and tid, both of
which will match, but the page that was
tested isn't the right one.


That barrier() is meaningless as soon as another CPU is involved. The
CPU can order things any way it wants, even if the assembly did it
differently. Due to cacheline misses and such, we have no idea whether
c->page has been prefetched or not.

We may get by with just disabling preemption and testing for page ==
NULL (just in case an interrupt comes in between fetching the object and
the page and resets them). But we can't grab freelist and page if c
points to another CPU's cpu_slab.

-- Steve

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/