Re: [PATCH] x86/smpboot: Add map vars allocation check in smp_prepare_cpus_common

From: Ingo Molnar
Date: Wed Apr 10 2024 - 09:30:38 EST



* Nikita Kiryushin <kiryushin@xxxxxxxx> wrote:

> As of now, the zalloc_cpumask_var() calls for the various maps in
> smp_prepare_cpus_common() are not checked.
>
> If an allocation fails, it will go unnoticed until the unallocated map
> is accessed. This does not seem very realistic now, but could become
> more relevant in the future, as the number of cores (and the amount of
> allocated resources) grows.
>
> Add a cumulative status for all zalloc_cpumask_var() calls in
> smp_prepare_cpus_common() and an error message in case the status signals
> that any of the map var allocations failed (per CPU).
>
> Found by Linux Verification Center (linuxtesting.org) with SVACE.
>
> Signed-off-by: Nikita Kiryushin <kiryushin@xxxxxxxx>
> ---
> arch/x86/kernel/smpboot.c | 15 ++++++++++-----
> 1 file changed, 10 insertions(+), 5 deletions(-)
>
> diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
> index 76bb65045c64..3b24c2e1fa3b 100644
> --- a/arch/x86/kernel/smpboot.c
> +++ b/arch/x86/kernel/smpboot.c
> @@ -1042,11 +1042,16 @@ void __init smp_prepare_cpus_common(void)
>  	}
>  
>  	for_each_possible_cpu(i) {
> -		zalloc_cpumask_var(&per_cpu(cpu_sibling_map, i), GFP_KERNEL);
> -		zalloc_cpumask_var(&per_cpu(cpu_core_map, i), GFP_KERNEL);
> -		zalloc_cpumask_var(&per_cpu(cpu_die_map, i), GFP_KERNEL);
> -		zalloc_cpumask_var(&per_cpu(cpu_llc_shared_map, i), GFP_KERNEL);
> -		zalloc_cpumask_var(&per_cpu(cpu_l2c_shared_map, i), GFP_KERNEL);
> +		bool ret = true;
> +
> +		ret &= zalloc_cpumask_var(&per_cpu(cpu_sibling_map, i), GFP_KERNEL);
> +		ret &= zalloc_cpumask_var(&per_cpu(cpu_core_map, i), GFP_KERNEL);
> +		ret &= zalloc_cpumask_var(&per_cpu(cpu_die_map, i), GFP_KERNEL);
> +		ret &= zalloc_cpumask_var(&per_cpu(cpu_llc_shared_map, i), GFP_KERNEL);
> +		ret &= zalloc_cpumask_var(&per_cpu(cpu_l2c_shared_map, i), GFP_KERNEL);
> +
> +		if (!ret)
> +			pr_err("Failed to allocate map for CPU%u\n", i);

So:

 - That doesn't really solve anything, nor does it propagate the error
   further up. Plus memory allocation failures within __init functions for
   key CPU data structures are invariably fatal. There might be more
   cores in the future, but there will be even more RAM, so this error
   condition will never be realistic.

 - The canonical arch behavior for __init functions is to return -ENOMEM
   and not printk anything. But that's not really an option for
   smp_prepare_cpus_common(), which feeds back into the
   ::smp_prepare_cpus() callback that doesn't really expect failure either.

My suggestion would be to simply pass in __GFP_NOFAIL to document that
there's no reasonable allocation failure policy here. That's better than
needlessly complicating this code.
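
Something along these lines, as an untested sketch only: it assumes the
loop still looks like the hunk quoted above, which is not the case on
tip:master, and the comment wording is just a placeholder:

	for_each_possible_cpu(i) {
		/* No sane failure policy this early in boot: never fail. */
		zalloc_cpumask_var(&per_cpu(cpu_sibling_map, i),
				   GFP_KERNEL | __GFP_NOFAIL);
		zalloc_cpumask_var(&per_cpu(cpu_core_map, i),
				   GFP_KERNEL | __GFP_NOFAIL);
		zalloc_cpumask_var(&per_cpu(cpu_die_map, i),
				   GFP_KERNEL | __GFP_NOFAIL);
		zalloc_cpumask_var(&per_cpu(cpu_llc_shared_map, i),
				   GFP_KERNEL | __GFP_NOFAIL);
		zalloc_cpumask_var(&per_cpu(cpu_l2c_shared_map, i),
				   GFP_KERNEL | __GFP_NOFAIL);
	}

That keeps the return values unchecked on purpose and needs no new
error path or printk.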

Also note that this code has changed in the latest x86 tree (tip:master).

Thanks,

Ingo