Re: [patch 2/2] slub: enforce cpuset restrictions for cpu slabs

From: David Rientjes
Date: Tue Mar 03 2009 - 12:19:57 EST


On Tue, 3 Mar 2009, Christoph Lameter wrote:

> > Slab allocations should respect cpuset hardwall restrictions. Otherwise,
> > it is possible for tasks in a cpuset to fill slabs allocated on mems
> > assigned to a disjoint cpuset.
>
> Not sure that I understand this correctly. If multiple tasks are running
> on the same processor that are part of disjoint cpusets and both tasks are
> performing slab allocations without specifying a node then one task could
> allocate a page from the first cpuset, take one object from it and then
> the second task on the same cpu could consume the rest from a nodeset that
> it would otherwise not be allowed to access. On the other hand it is
> likely that the second task will also allocate memory from its allowed
> nodes that are then consumed by the first task. This is a tradeoff coming
> with the pushing of the enforcement of memory policy / cpuset stuff out of
> the slab allocator and relying for this on the page allocator.
>

Yes, I agree that it's a significant optimization to allow the cpu slab to
be used by tasks that are not allowed, because of their mempolicy or
cpuset restrictions, to access the node on which it was allocated. That's
especially true for small object sizes or short-lived allocations where
the hardwall infringement is acceptable for the speed-up.

Unfortunately, it also leads to a violation of the user imposed
restriction on acceptable memory usage. One of the important aspects of
cpusets is to allow memory isolation from other siblings. It should be
possible to kill all tasks in a cpuset, for example, and expect its
partial list to be emptied and not heavily fragmented by long-lived
allocations that could prevent any partial slab freeing, a situation that
arises when heavy slab users are allowed to allocate objects anywhere.

> > If an allocation is intended for a particular node that the task does not
> > have access to because of its cpuset, an allowed partial slab is used
> > instead of failing.
>
> This would get us back to the slab allocator enforcing memory policies.
>

Is that a problem? get_any_partial() already enforces cpuset-aware memory
policies when defragmenting remote partial slabs.
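For reference, the pattern get_any_partial() already follows is a per-node hardwall check during the scan for remote partial slabs. Below is a minimal userspace sketch of that pattern; the names (node_allowed_hardwall, get_any_partial_node), the bitmask representation of mems_allowed, and the toy partial list are all illustrative assumptions, not kernel code:

```c
#include <assert.h>

/*
 * Userspace model of the hardwall check that slub.c's get_any_partial()
 * performs when scanning remote nodes for partial slabs.  All names and
 * data structures here are illustrative stand-ins, not kernel source.
 */

#define MAX_NODES 8

/* The task's cpuset-allowed node mask, modeled as a simple bitmask. */
static unsigned long mems_allowed = 0x3;        /* nodes 0 and 1 allowed */

/* Stand-in for the cpuset hardwall check: is this node allowed? */
static int node_allowed_hardwall(int node)
{
        return (mems_allowed >> node) & 1;
}

/* Toy "partial lists": nonzero means the node has a partial slab. */
static int has_partial[MAX_NODES] = { 0, 0, 1, 1, 0, 0, 0, 0 };

/*
 * Model of get_any_partial(): walk the nodes in order and take a
 * partial slab only from a node the cpuset hardwall permits.  Returns
 * the node chosen, or -1 if every candidate is forbidden or empty.
 */
static int get_any_partial_node(void)
{
        for (int node = 0; node < MAX_NODES; node++) {
                if (!node_allowed_hardwall(node))
                        continue;       /* enforce the hardwall */
                if (has_partial[node])
                        return node;
        }
        return -1;
}
```

With the masks above, nodes 2 and 3 hold partial slabs but are outside the task's allowed set, so the scan comes up empty rather than violating the hardwall; widening mems_allowed makes node 2 eligible.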

> > -static inline int node_match(struct kmem_cache_cpu *c, int node)
> > +static inline int node_match(struct kmem_cache_cpu *c, int node, gfp_t gfpflags)
> > {
> > #ifdef CONFIG_NUMA
> > if (node != -1 && c->node != node)
> > return 0;
> > #endif
> > - return 1;
> > + return cpuset_node_allowed_hardwall(c->node, gfpflags);
> > }
>
> This is a hotpath function and doing an expensive function call here would
> significantly impact performance.
>

It's not expensive. It's a no-op for !CONFIG_CPUSETS configs and only a
global variable read for machines running with a single cpuset. When the
machine has multiple cpusets, memory restrictions are in place, so
checking current->mems_allowed is required and its performance impact
should be accepted.
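The cost tiers described above can be sketched in userspace; here number_of_cpusets and task_mems_allowed are stand-ins for the kernel's global cpuset count and current->mems_allowed, and the function name is an assumption modeled on the patch, not kernel source:

```c
#include <assert.h>

/* Assume cpusets are compiled in for this model; with CONFIG_CPUSETS
 * unset, the real check compiles away to a constant 1. */
#define CONFIG_CPUSETS 1

/* Stand-in for the kernel's global count of active cpusets. */
static int number_of_cpusets = 1;

/* Stand-in for current->mems_allowed, as a bitmask over nodes. */
static unsigned long task_mems_allowed = ~0UL;

/* Model of the hardwall check's cost tiers. */
static int node_allowed_hardwall(int node)
{
#ifdef CONFIG_CPUSETS
        /* Fast path: a single (root) cpuset cannot restrict anything,
         * so this is just one global variable read. */
        if (number_of_cpusets <= 1)
                return 1;
        /* Slow path: multiple cpusets exist, so restrictions may be in
         * place and the task's allowed-node mask must be consulted. */
        return (task_mems_allowed >> node) & 1;
#else
        (void)node;
        return 1;       /* compile-time no-op */
#endif
}
```

With one cpuset every node passes after the single global read; only once a second cpuset exists does the per-task mask check run.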

> It also will cause a reloading of the per cpu slab after each task switch
> in the scenario discussed above.
>

There is no alternative solution for preventing egregious amounts of slab
memory from being allocated in a disjoint cpuset that is supposedly
mem_exclusive.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/