Re: [PATCH] sched/isolation: fix boot crash when maxcpus < first-housekeeping-cpu

From: Frederic Weisbecker
Date: Thu Apr 18 2024 - 11:41:06 EST


On Sat, Apr 13, 2024 at 04:17:46PM +0200, Oleg Nesterov wrote:
> housekeeping_setup() checks cpumask_intersects(present, online) to ensure
> that the kernel will have at least one housekeeping CPU after smp_init(),
> but this doesn't work if the maxcpus= kernel parameter limits the number
> of processors available after bootup.
>
> For example, the kernel with "maxcpus=2 nohz_full=0-2" parameters crashes
> at boot time on my virtual machine with 4 CPUs.
>
> Change housekeeping_setup() to use cpumask_first_and() and check that the
> returned cpu number is valid and less than setup_max_cpus.
>
> Another corner case is "nohz_full=0" on a machine with a single CPU or
> with the maxcpus=1 kernel argument. In this case non_housekeeping_mask
> is empty and IIUC tick_nohz_full_setup() makes no sense. And indeed, the
> kernel hits the WARN_ON(tick_nohz_full_running) in tick_sched_do_timer().
>
> And how should the kernel interpret an empty "nohz_full=" parameter? I think
> it should be silently ignored, but currently cpulist_parse() happily
> returns the empty cpumask and this leads to the same problem.
>
> Change housekeeping_setup() to check cpumask_empty(non_housekeeping_mask)
> and do nothing in this case.
>
> Signed-off-by: Oleg Nesterov <oleg@xxxxxxxxxx>

Acked-by: Frederic Weisbecker <frederic@xxxxxxxxxx>
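
For anyone following the thread, the two checks described in the changelog
can be modelled in userspace C roughly as below. This is only a sketch under
stated assumptions, not the patch itself: a 64-bit integer stands in for a
cpumask, and mask_t, first_and(), hk_setup(), struct hk_result and
hk_selfcheck() are hypothetical names made up for illustration. Only the
decision logic (first present housekeeping CPU must be valid and below
maxcpus, and an empty isolated mask means do nothing) is taken from the
description above. The fallback to the boot CPU is an assumption about what
happens when the first check fails.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for a cpumask: bit n set means CPU n is in the mask. */
typedef uint64_t mask_t;

/* Lowest CPU set in both masks, or nr_cpus if there is none. */
static unsigned int first_and(mask_t a, mask_t b, unsigned int nr_cpus)
{
	mask_t both = a & b;

	for (unsigned int cpu = 0; cpu < nr_cpus; cpu++)
		if (both & ((mask_t)1 << cpu))
			return cpu;
	return nr_cpus;
}

struct hk_result {
	bool isolation_enabled;	/* false: parameter is silently ignored */
	mask_t housekeeping;
	mask_t isolated;
};

/* Hypothetical model of the checks; nr_cpus must be <= 64. */
static struct hk_result hk_setup(mask_t possible, mask_t present,
				 mask_t requested, unsigned int nr_cpus,
				 unsigned int max_cpus, unsigned int boot_cpu)
{
	struct hk_result r = {
		.housekeeping = possible & ~requested,
		.isolated = requested,
	};
	unsigned int first = first_and(present, r.housekeeping, nr_cpus);

	/* No usable housekeeping CPU below maxcpus: assume boot CPU is used. */
	if (first >= nr_cpus || first >= max_cpus) {
		r.housekeeping |= (mask_t)1 << boot_cpu;
		r.isolated &= ~((mask_t)1 << boot_cpu);
	}

	/* Nothing left to isolate: do nothing. */
	r.isolation_enabled = (r.isolated != 0);
	return r;
}

/* The two scenarios from the changelog, checked against the model. */
static void hk_selfcheck(void)
{
	/* 4 CPUs, "maxcpus=2 nohz_full=0-2": CPU 3 is not below maxcpus. */
	struct hk_result a = hk_setup(0xF, 0xF, 0x7, 4, 2, 0);
	assert(a.isolation_enabled && a.housekeeping == 0x9 && a.isolated == 0x6);

	/* Single CPU, "nohz_full=0": the isolated mask ends up empty. */
	struct hk_result b = hk_setup(0x1, 0x1, 0x1, 1, 1, 0);
	assert(!b.isolation_enabled);
}
```

In the first scenario the only housekeeping candidate is CPU 3, which never
comes online with maxcpus=2, so the model moves CPU 0 back to housekeeping.
In the second, removing the boot CPU from the isolated set leaves it empty,
which is the case the changelog says should be ignored.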