Re: [PATCH v3 2/2] vfs: avoid duplicating creds in faccessat if possible
From: Yury Norov
Date: Fri Mar 03 2023 - 22:25:12 EST
[...]
> In particular, 'cpumask_clear()' should just zero the cpumask, and on
> the config I use, I have
>
> CONFIG_NR_CPUS=64
>
> so it should literally just be a single "store zero to cpumask word".
> And that's what it used to be.
>
> But then we had commit aa47a7c215e7 ("lib/cpumask: deprecate
> nr_cpumask_bits") and suddenly 'nr_cpumask_bits' isn't a simple
> constant any more for the "small mask that fits on stack" case, and
> instead you end up with code like
>
> movl nr_cpu_ids(%rip), %edx
> addq $63, %rdx
> shrq $3, %rdx
> andl $-8, %edx
> ..
> callq memset@PLT
>
> that does a 8-byte memset because I have 32 cores and 64 threads.

Did you enable CONFIG_FORCE_NR_CPUS? If you enable it, the kernel binds
nr_cpu_ids to NR_CPUS at compile time, so the size becomes a constant
and the memset() call should disappear.
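
To illustrate the difference, here is a rough userspace sketch, not the
actual kernel code. FORCE_NR_CPUS is a hypothetical stand-in for
CONFIG_FORCE_NR_CPUS, and struct cpumask and cpumask_clear_sketch() are
simplified assumptions made for this example:

```c
#include <assert.h>
#include <string.h>

#define NR_CPUS 64
#define BITS_PER_LONG (8 * sizeof(unsigned long))
#define BITS_TO_LONGS(n) (((n) + BITS_PER_LONG - 1) / BITS_PER_LONG)

/* Simplified stand-in for the kernel's struct cpumask. */
struct cpumask {
	unsigned long bits[BITS_TO_LONGS(NR_CPUS)];
};

#ifdef FORCE_NR_CPUS	/* hypothetical stand-in for CONFIG_FORCE_NR_CPUS */
/* Compile-time constant: the size below folds to sizeof(unsigned long). */
#define nr_cpu_ids ((unsigned int)NR_CPUS)
#else
/* Runtime variable: the compiler must load it and compute the size. */
unsigned int nr_cpu_ids = NR_CPUS;
#endif

static inline void cpumask_clear_sketch(struct cpumask *m)
{
	assert(nr_cpu_ids <= NR_CPUS);
	/* With a constant nr_cpu_ids this can become a single store. */
	memset(m->bits, 0, BITS_TO_LONGS(nr_cpu_ids) * sizeof(unsigned long));
}
```

Built with -DFORCE_NR_CPUS, the length is known at compile time, so the
compiler is free to replace the memset() with a plain store. Without
it, the length depends on a load from nr_cpu_ids.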

Depending on your compiler, you might want to apply this patch as well:

https://lore.kernel.org/lkml/20221027043810.350460-2-yury.norov@xxxxxxxxx/

> Now, at least some distro kernels seem to be built with CONFIG_MAXSMP,
> so CONFIG_NR_CPUS is something insane (namely 8192), and then it is
> indeed better to calculate some minimum size instead of doing a 1kB
> memset().

Ubuntu too. That was one of the reasons for the patch.

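For reference, the size calculation in the quoted assembly (add 63,
shift right by 3, mask with -8) can be sketched in plain C as below.
cpumask_bytes_sketch() is a name made up for this example, and it
assumes 64-bit longs:

```c
#include <stddef.h>

/*
 * Bytes covered by the memset: round the bit count up to whole
 * 64-bit words, expressed in bytes. Mirrors addq $63 / shrq $3 /
 * andl $-8 from the quoted code.
 */
static size_t cpumask_bytes_sketch(unsigned int nbits)
{
	return (((size_t)nbits + 63) >> 3) & ~(size_t)7;
}
```

With nbits = 64 this gives 8 bytes, and with nbits = 8192 (the MAXSMP
value of NR_CPUS) it gives 1024 bytes, which is the 1kB memset()
mentioned above.
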
Thanks,
Yury