Re: [patch 2/2] slub: add min_partial sysfs tunable

From: David Rientjes
Date: Mon Feb 23 2009 - 04:58:24 EST


On Mon, 23 Feb 2009, Pekka Enberg wrote:

> The patches look good but the description is a bit lacking. Does this
> actually fix up something? Why don't we fix the limit calculations
> instead?
>
> I'm a sucker for numbers so I'm easily fooled into merging patches with
> statements of the form "this shaves off N bytes/kb/mb on XYZ systems".
>

The memory savings from simply moving min_partial from struct
kmem_cache_node to struct kmem_cache are obviously not significant (unless
maybe you're from SGI or something); at most they amount to

# allocated caches * (MAX_NUMNODES - 1) * sizeof(unsigned long)
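
To put a rough number on that bound, here is a small userspace sketch. The
helper name is made up, and the inputs (150 caches, a node shift of 10) are
assumptions picked for illustration, not measurements.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Upper bound on the bytes saved by keeping one min_partial per cache
 * instead of one per node. nr_caches and nodes_shift are illustrative
 * inputs; max_struct_savings() is a hypothetical helper, not kernel code.
 */
static size_t max_struct_savings(size_t nr_caches, unsigned int nodes_shift)
{
	size_t max_numnodes;

	assert(nodes_shift < 16);	/* keep the shift well-defined */
	max_numnodes = (size_t)1 << nodes_shift;	/* MAX_NUMNODES */

	return nr_caches * (max_numnodes - 1) * sizeof(unsigned long);
}
```

With 150 caches, a shift of 10 and an 8-byte unsigned long, that is
150 * 1023 * 8 = 1227600 bytes, so a bit over 1MB in the very worst case.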

The true savings occur when userspace reduces the number of partial slabs
that would otherwise be wasted, especially on machines with a large
number of nodes (ia64 with CONFIG_NODES_SHIFT at 10 by default?). However
well the kernel estimates ideal values for n->min_partial and keeps them
within a sane range, userspace has no input other than writing
to /sys/kernel/slab/cache/shrink.

There simply isn't any better heuristic we could add to the min_partial
calculation that would work for all possible caches. And since it's
currently a static value, the user really has no way of reclaiming that
wasted space short of shrinking the cache entirely, and the waste can be
significant when constrained by a cgroup (either cpusets or, later, memory
controller slab limits).

This also allows the user to specify that increased fragmentation and more
partial slabs are actually desired to avoid the cost of allocating new
slabs at runtime for specific caches.
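
From userspace, tuning a cache would look roughly like the sketch below. It
assumes the attribute is named min_partial and sits next to shrink under
/sys/kernel/slab/<cache>/; both helper names are hypothetical.

```c
#include <stdio.h>

/*
 * Build the sysfs path for a cache's min_partial attribute.
 * Assumes it lives at /sys/kernel/slab/<cache>/min_partial.
 * Returns 0 on success, -1 if the buffer is too small.
 */
static int min_partial_path(char *buf, size_t len, const char *cache)
{
	int n = snprintf(buf, len, "/sys/kernel/slab/%s/min_partial", cache);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Write a new min_partial value (needs root). Returns 0 on success. */
static int set_min_partial(const char *cache, unsigned long value)
{
	char path[256];
	FILE *f;
	int ret = 0;

	if (min_partial_path(path, sizeof(path), cache))
		return -1;
	f = fopen(path, "w");
	if (!f)
		return -1;
	if (fprintf(f, "%lu\n", value) < 0)
		ret = -1;
	if (fclose(f) != 0)	/* sysfs write errors can surface here */
		ret = -1;
	return ret;
}
```

A small value trades allocation cost for memory; a large one does the
opposite, e.g. set_min_partial("dentry", 32) to keep more partial slabs
around for a hot cache.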

There's also no reason why this should be a per-struct kmem_cache_node
value in the first place. You could argue that a machine would have such
node size asymmetries that it should be specified on a per-node basis, but
we know nobody is doing that right now since it's a purely static value at
the moment and there's no convenient way to tune that via slub's sysfs
interface.
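
For what it's worth, the per-cache arrangement can be pictured with a toy
model like the one below. The struct and function names are simplified
stand-ins, not the real slub definitions, and the sketch assumes the free
path simply compares a node's partial count against the shared threshold.

```c
#include <stdbool.h>

/* Simplified stand-in for struct kmem_cache_node. */
struct toy_node {
	unsigned long nr_partial;	/* partial slabs currently on this node */
};

/* Simplified stand-in for struct kmem_cache. */
struct toy_cache {
	unsigned long min_partial;	/* one threshold shared by every node */
	struct toy_node node[4];	/* 4 nodes, chosen arbitrarily */
};

/*
 * Decide whether a slab that just became empty stays on the node's
 * partial list (true) or goes back to the page allocator (false).
 */
static bool keep_empty_slab(const struct toy_cache *s, int nid)
{
	return s->node[nid].nr_partial < s->min_partial;
}
```

Lowering min_partial in this model makes every node give empty slabs back
sooner; raising it keeps more of them cached on each node.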