Re: [PATCH] sched/isolation: fix boot crash when the boot CPU is nohz_full

From: Oleg Nesterov
Date: Mon Apr 22 2024 - 14:52:11 EST


Phil, Frederic,

Thanks for your review! Who do you think can take these patches?
At least the 1st one.

To remind, there are more problems with boot CPU in nohz mask, but
can we at least fix the kernel crash?

Oleg.

On 04/18, Phil Auld wrote:
>
> On Thu, Apr 11, 2024 at 04:39:05PM +0200 Oleg Nesterov wrote:
> > Documentation/timers/no_hz.rst states that the "nohz_full=" mask must not
> > include the boot CPU, this is no longer true after the commit 08ae95f4fd3b
> > ("nohz_full: Allow the boot CPU to be nohz_full").
> >
> > However after another commit aae17ebb53cd ("workqueue: Avoid using isolated
> > cpus' timers on queue_delayed_work") the kernel will crash at boot time in
> > this case; housekeeping_any_cpu() returns an invalid cpu nr until smp_init()
> > paths bring the 1st housekeeping CPU up.
> >
> > Change housekeeping_any_cpu() to check the result of cpumask_any_and() and
> > return smp_processor_id() in this case. Yes, this is just the simple and
> > backportable workaround which fixes the symptom, but smp_processor_id() at
> > boot time should be safe at least for type == HK_TYPE_TIMER, this more or
> > less matches the tick_do_timer_boot_cpu logic.
> >
> > We should not worry about cpu_down(); tick_nohz_cpu_down() will not allow
> > to offline tick_do_timer_cpu (the 1st online housekeeping CPU).
> >
> > Fixes: aae17ebb53cd ("workqueue: Avoid using isolated cpus' timers on queue_delayed_work")
> > Reported-by: Chris von Recklinghausen <crecklin@xxxxxxxxxx>
> > Closes: https://lore.kernel.org/all/20240402105847.GA24832@xxxxxxxxxx/
> > Signed-off-by: Oleg Nesterov <oleg@xxxxxxxxxx>
>
> Checking the returned value instead of assuming seems safer in any case.
>
> Reviewed-by: Phil Auld <pauld@xxxxxxxxxx>
>
> > ---
> > Documentation/timers/no_hz.rst | 7 ++-----
> > kernel/sched/isolation.c | 11 ++++++++++-
> > 2 files changed, 12 insertions(+), 6 deletions(-)
> >
> > diff --git a/Documentation/timers/no_hz.rst b/Documentation/timers/no_hz.rst
> > index f8786be15183..7fe8ef9718d8 100644
> > --- a/Documentation/timers/no_hz.rst
> > +++ b/Documentation/timers/no_hz.rst
> > @@ -129,11 +129,8 @@ adaptive-tick CPUs: At least one non-adaptive-tick CPU must remain
> > online to handle timekeeping tasks in order to ensure that system
> > calls like gettimeofday() returns accurate values on adaptive-tick CPUs.
> > (This is not an issue for CONFIG_NO_HZ_IDLE=y because there are no running
> > -user processes to observe slight drifts in clock rate.) Therefore, the
> > -boot CPU is prohibited from entering adaptive-ticks mode. Specifying a
> > -"nohz_full=" mask that includes the boot CPU will result in a boot-time
> > -error message, and the boot CPU will be removed from the mask. Note that
> > -this means that your system must have at least two CPUs in order for
> > +user processes to observe slight drifts in clock rate.) Note that this
> > +means that your system must have at least two CPUs in order for
> > CONFIG_NO_HZ_FULL=y to do anything for you.
> >
> > Finally, adaptive-ticks CPUs must have their RCU callbacks offloaded.
> > diff --git a/kernel/sched/isolation.c b/kernel/sched/isolation.c
> > index 373d42c707bc..2a262d3ecb3d 100644
> > --- a/kernel/sched/isolation.c
> > +++ b/kernel/sched/isolation.c
> > @@ -46,7 +46,16 @@ int housekeeping_any_cpu(enum hk_type type)
> > if (cpu < nr_cpu_ids)
> > return cpu;
> >
> > - return cpumask_any_and(housekeeping.cpumasks[type], cpu_online_mask);
> > + cpu = cpumask_any_and(housekeeping.cpumasks[type], cpu_online_mask);
> > + if (likely(cpu < nr_cpu_ids))
> > + return cpu;
> > + /*
> > + * Unless we have another problem this can only happen
> > + * at boot time before start_secondary() brings the 1st
> > + * housekeeping CPU up.
> > + */
> > + WARN_ON_ONCE(system_state == SYSTEM_RUNNING ||
> > + type != HK_TYPE_TIMER);
> > }
> > }
> > return smp_processor_id();
> > --
> > 2.25.1.362.g51ebf55
> >
> >
> >
>
> --
>