Re: [PATCH] cpumask: Create dedicated kmem cache for cpumask var

From: Yury Norov
Date: Wed Mar 20 2024 - 13:47:04 EST


On Wed, Mar 20, 2024 at 10:03:02AM +0100, Rasmus Villemoes wrote:
> On 19/03/2024 13.24, Dawei Li wrote:
> > alloc_cpumask_var_node() and friends allocate cpumask var dynamically
> > for CONFIG_CPUMASK_OFFSTACK=y kernel. The allocated size of cpumask var
> > is cpumask_size(), which is runtime constant after nr_cpu_ids is
> > freezed.
> >
> > Create a dedicated kmem cache for dynamic allocation of cpumask var.
>
> Why?

Hi Dawei,

Agree with Rasmus. CPUMASK_OFFSTACK=y is quite a rare configuration,
normally disabled. Can you show the architecture that you're working
with and how the cache you're adding affects performance.

> > The window for creation of cache is somewhat narrow:
> > - After last update of nr_cpu_ids(via set_nr_cpu_ids())
> > - After kmem cache is available.
> > - Before any alloc_cpumask_var_node() invocations(sched_init() e.g).

Not only narrow but also not uniform across platforms. For example,
on XEN xen_smp_count_cpus() may adjust nr_cpu_ids. I don't think that
people runn XEN with CPUMASK_OFFSTACK=y, but you have to make sure
that your cache is always created before the 1st allocation.

> OK, so this sounds somewhat fragile. It's maybe correct, but I fail to
> see what is gained by this, and the commit message does not provide any
> hints.

Agree. To make it less vulnerable, you have to enforce something like:

bool cpumask_cache_used = false;

static inline void set_nr_cpu_ids(unsigned int nr)
{
if (WARN_ON(cpumask_cache_used))
return;

nr_cpu_ids = nr;
cpumask_cache_destroy();
cpumask_cache_init()
}

bool alloc_cpumask_var_node()
{
cpumask_cache_used = true;
*mask = kmalloc_node(cpumask_size(), flags, node);
...
}

But at the very first, we need to understand under which scenarios the
new cache would benefit performance?

Thnaks,
Yury
benefits performance