Re: [RFC] issue with cpumask for UniProcessor

From: Sander Vanheule
Date: Mon Aug 08 2022 - 13:35:05 EST


Hi Saurabh,

On Mon, 2022-08-08 at 09:23 -0700, Saurabh Sengar wrote:
>
> Hi,
>
> I am working on a UniProcessor system with the latest linux-next kernel
> (20220803).
> I observed that two files, "shared_cpu_map" and "shared_cpu_list", are missing
> for the L3 cache (/sys/devices/system/cpu/cpu0/cache/index3). This causes lscpu
> version 2.34 to segfault. On further digging, I found that the commit below
> introduced this problem.
>
> https://lore.kernel.org/lkml/e78c55ecb98172356248a7a89da501479ead6ae0.1659077534.git.sander@xxxxxxxxxxxxx/
>

This is the v5 of the patch, which sadly isn't the version that got merged. The
commit that's triggering your issue is b81dce77cedc ("cpumask: Fix invalid
uniprocessor mask assumption"), which is patch v4.

https://lore.kernel.org/lkml/86bf3f005abba2d92120ddd0809235cab4f759a6.1656777646.git.sander@xxxxxxxxxxxxx/

>
> I am not 100% certain what the proper fix for it is, but the changes below fix
> this issue. I understand the above patch is already confirmed for Linux kernel
> 6.0; please advise whether this needs fixing in 6.0.
>
> Regards,
> Saurabh
>
>
>
> diff --git a/lib/cpumask.c b/lib/cpumask.c
> index b9728513a4d4..81fc2e35b5b1 100644
> --- a/lib/cpumask.c
> +++ b/lib/cpumask.c
> @@ -16,10 +16,14 @@
>   */
>  unsigned int cpumask_next(int n, const struct cpumask *srcp)
>  {
> +#if NR_CPUS == 1
> +       return n+1;
> +#else

This is ignoring the provided cpumask again, which was exactly what my patch
fixed. If the mask is empty, then cpumask_next(-1, mask) should return (at
least) 1, not 0.
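
As a rough userspace illustration of the difference for NR_CPUS == 1 (this is
not kernel code: struct toy_mask, TOY_NR_CPUS and the function names are made
up for this example, and the mask is reduced to a single word):

```c
#include <assert.h>

#define TOY_NR_CPUS 1	/* uniprocessor: only bit 0 can ever be set */

/* Hypothetical stand-in for struct cpumask: one word of bits. */
struct toy_mask {
	unsigned long bits;
};

/* Shortcut as in the proposed diff: the mask argument is never read. */
unsigned int next_ignoring_mask(int n, const struct toy_mask *srcp)
{
	(void)srcp;
	return n + 1;
}

/*
 * Mask-aware lookup: return the first set bit above n, or TOY_NR_CPUS
 * ("no further CPU") when no such bit is set.
 */
unsigned int next_respecting_mask(int n, const struct toy_mask *srcp)
{
	unsigned int cpu;

	for (cpu = n + 1; cpu < TOY_NR_CPUS; cpu++)
		if (srcp->bits & (1UL << cpu))
			return cpu;
	return TOY_NR_CPUS;
}

/* Small self-check contrasting the two on an empty and a populated mask. */
void toy_mask_demo(void)
{
	struct toy_mask empty = { 0 };
	struct toy_mask cpu0 = { 1UL };

	/* The shortcut reports CPU 0 even though the mask is empty. */
	assert(next_ignoring_mask(-1, &empty) == 0);
	/* The mask-aware version reports "no CPU" (>= TOY_NR_CPUS). */
	assert(next_respecting_mask(-1, &empty) == 1);
	/* With bit 0 set, both agree on CPU 0. */
	assert(next_ignoring_mask(-1, &cpu0) == 0);
	assert(next_respecting_mask(-1, &cpu0) == 0);
}
```

Both versions agree whenever bit 0 is set; they only diverge for an empty
mask, which is the case the shortcut gets wrong.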

I think the problem could be caused by cpumask_next() getting an empty mask.
Then the real issue would be that a certain mask is empty when it shouldn't
be, which was compensated by the old code's built-in assumption that a cpumask
couldn't be empty.
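
A hedged sketch of how that compensation could hide such a bug (again plain
userspace C, with struct up_mask and the function names invented for the
example; I have not checked that the cacheinfo code follows this pattern): a
walk that assumes CPU 0 is always present still does its work on an empty
mask, while a mask-aware walk silently does nothing.

```c
#include <assert.h>

/* Hypothetical one-word mask for a single-CPU system. */
struct up_mask {
	unsigned long bits;
};

/* Old-style assumption: CPU 0 is always visited, the mask is not consulted. */
unsigned int visits_assuming_cpu0(const struct up_mask *mask)
{
	unsigned int cpu, visits = 0;

	(void)mask;
	for (cpu = 0; cpu < 1; cpu++)
		visits++;
	return visits;
}

/* Mask-aware walk: CPU 0 is visited only if its bit is actually set. */
unsigned int visits_checking_mask(const struct up_mask *mask)
{
	unsigned int cpu, visits = 0;

	for (cpu = 0; cpu < 1; cpu++)
		if (mask->bits & (1UL << cpu))
			visits++;
	return visits;
}

/* Self-check: the two walks differ only when the mask is empty. */
void up_mask_demo(void)
{
	struct up_mask empty = { 0 };
	struct up_mask cpu0 = { 1UL };

	assert(visits_assuming_cpu0(&empty) == 1);	/* works "by accident" */
	assert(visits_checking_mask(&empty) == 0);	/* loop body never runs */
	assert(visits_assuming_cpu0(&cpu0) == 1);
	assert(visits_checking_mask(&cpu0) == 1);
}
```

If the code creating the sysfs files relies on a loop like the second one, a
mask that was never populated would leave it with nothing to do.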

My MIPS testing system doesn't have these L3 maps, and "shared_cpu_map" and
"shared_cpu_list" are present for index0 and index1. I would propose that you
look for the point where the files should be created, and check how
cpumask_next() is involved, to find the actual cause of this problem.

Best,
Sander

>         /* -1 is a legal arg here. */
>         if (n != -1)
>                 cpumask_check(n);
>         return find_next_bit(cpumask_bits(srcp), nr_cpumask_bits, n + 1);
> +#endif
>  }
>  EXPORT_SYMBOL(cpumask_next);