Re: [PATCH] tick/nohz_full: don't abuse smp_call_function_single() in tick_setup_device()

From: Oleg Nesterov
Date: Sat Jun 01 2024 - 10:05:08 EST


Hi Frederic,

First of all, can we please make the additional changes you suggest on top of
this patch? I'd prefer to keep it as simple as possible: I will need to backport
it, and I'd like to simplify the internal review.

On 05/30, Frederic Weisbecker wrote:
>
> And after all, pushing a bit further your subsequent patch, can we get rid of
> tick_do_timer_boot_cpu and ifdefery altogether? Such as:

Sure, I thought about this from the very beginning, see
https://lore.kernel.org/all/20240525135120.GA24152@xxxxxxxxxx/
and the changelog in
[PATCH] tick/nohz_full: turn tick_do_timer_boot_cpu into boot_cpu_is_nohz_full
https://lore.kernel.org/all/20240530124032.GA26833@xxxxxxxxxx/
on top of this patch.

And yes, in this case it is better to check that tick_do_timer_cpu != _NONE to
ensure that tick_nohz_full_cpu(tick_cpu) can't crash.

So I considered a change which is very close to yours, except

> + } else if (timekeeper == TICK_DO_TIMER_NONE) {
> + if (WARN_ON_ONCE(tick_nohz_full_enabled()))
> + WRITE_ONCE(tick_do_timer_cpu, cpu);

I don't think we need to change tick_do_timer_cpu in this case.
And I am not sure we need to check tick_nohz_full_enabled() here.
IOW, I was thinking about

	if (!td->evtdev) {
		int tick_cpu = READ_ONCE(tick_do_timer_cpu);
		/*
		 * If no cpu took the do_timer update, assign it to
		 * this cpu:
		 */
		if (tick_cpu == TICK_DO_TIMER_BOOT) {
			WRITE_ONCE(tick_do_timer_cpu, cpu);
			tick_next_period = ktime_get();
			/*
			 * The boot CPU may be nohz_full, in which case the
			 * first housekeeping secondary will take do_timer()
			 * from us.
			 */
		} else if (!WARN_ON_ONCE(tick_cpu == TICK_DO_TIMER_NONE) &&
			   tick_nohz_full_cpu(tick_cpu) &&
			   !tick_nohz_full_cpu(cpu)) {
			/*
			 * The boot CPU will stay in periodic (NOHZ disabled)
			 * mode until clocksource_done_booting() called after
			 * smp_init() selects a high resolution clocksource and
			 * timekeeping_notify() kicks the NOHZ stuff alive.
			 *
			 * So this WRITE_ONCE can only race with the READ_ONCE
			 * check in tick_periodic() but this race is harmless.
			 */
			WRITE_ONCE(tick_do_timer_cpu, cpu);
		}

But you know, somehow I like
[PATCH] tick/nohz_full: turn tick_do_timer_boot_cpu into boot_cpu_is_nohz_full
https://lore.kernel.org/all/20240530124032.GA26833@xxxxxxxxxx/
a bit more, to me the code looks more understandable this way.

Note that this patch doesn't really need to keep #ifdef CONFIG_NO_HZ_FULL,

	if (!td->evtdev) {
		static bool boot_cpu_is_nohz_full;
		/*
		 * If no cpu took the do_timer update, assign it to
		 * this cpu:
		 */
		if (READ_ONCE(tick_do_timer_cpu) == TICK_DO_TIMER_BOOT) {
			WRITE_ONCE(tick_do_timer_cpu, cpu);
			tick_next_period = ktime_get();
			/*
			 * The boot CPU may be nohz_full, in which case the
			 * first housekeeping secondary will take do_timer()
			 * from us.
			 */
			boot_cpu_is_nohz_full = tick_nohz_full_cpu(cpu);
		} else if (boot_cpu_is_nohz_full && !tick_nohz_full_cpu(cpu)) {
			boot_cpu_is_nohz_full = false;
			/*
			 * The boot CPU will stay in periodic (NOHZ disabled)
			 * mode until clocksource_done_booting() called after
			 * smp_init() selects a high resolution clocksource and
			 * timekeeping_notify() kicks the NOHZ stuff alive.
			 *
			 * So this WRITE_ONCE can only race with the READ_ONCE
			 * check in tick_periodic() but this race is harmless.
			 */
			WRITE_ONCE(tick_do_timer_cpu, cpu);
		}

should work without #ifdef.

In this case I don't think we need the _NONE check; tick_sched_do_timer() will
complain.
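
FWIW, the two variants discussed in this mail can be compared with a tiny
userspace model. This is only a sketch, not kernel code: struct tick_model,
setup_by_tick_cpu(), setup_by_flag() and timekeeper_after_boot() are
hypothetical names, the nohz_full mask is a plain array, CPUs come up strictly
in order 0..NR_CPUS-1, and READ_ONCE/WRITE_ONCE, tick_next_period and the
WARN_ON_ONCE are left out. It only models who ends up owning do_timer().

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS			8	/* arbitrary size for the model */
#define TICK_DO_TIMER_NONE	(-1)
#define TICK_DO_TIMER_BOOT	(-2)

/* Hypothetical stand-in for the state tick_setup_device() looks at. */
struct tick_model {
	bool nohz_full[NR_CPUS];	/* models tick_nohz_full_cpu(cpu) */
	int tick_do_timer_cpu;
	bool boot_cpu_is_nohz_full;	/* only used by the flag variant */
};

static void model_init(struct tick_model *m, unsigned int nohz_full_mask)
{
	for (int i = 0; i < NR_CPUS; i++)
		m->nohz_full[i] = (nohz_full_mask >> i) & 1u;
	m->tick_do_timer_cpu = TICK_DO_TIMER_BOOT;
	m->boot_cpu_is_nohz_full = false;
}

/* Variant 1: decide by looking at the current tick_do_timer_cpu. */
static void setup_by_tick_cpu(struct tick_model *m, int cpu)
{
	int tick_cpu = m->tick_do_timer_cpu;

	assert(cpu >= 0 && cpu < NR_CPUS);
	if (tick_cpu == TICK_DO_TIMER_BOOT) {
		m->tick_do_timer_cpu = cpu;
	} else if (tick_cpu != TICK_DO_TIMER_NONE &&
		   m->nohz_full[tick_cpu] && !m->nohz_full[cpu]) {
		/* first housekeeping CPU takes over from a nohz_full one */
		m->tick_do_timer_cpu = cpu;
	}
}

/* Variant 2: remember in a flag that the boot CPU was nohz_full. */
static void setup_by_flag(struct tick_model *m, int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	if (m->tick_do_timer_cpu == TICK_DO_TIMER_BOOT) {
		m->tick_do_timer_cpu = cpu;
		m->boot_cpu_is_nohz_full = m->nohz_full[cpu];
	} else if (m->boot_cpu_is_nohz_full && !m->nohz_full[cpu]) {
		m->boot_cpu_is_nohz_full = false;
		m->tick_do_timer_cpu = cpu;
	}
}

/* Bring up CPUs 0..NR_CPUS-1 in order, return the final timekeeper. */
int timekeeper_after_boot(unsigned int nohz_full_mask, bool use_flag)
{
	struct tick_model m;

	model_init(&m, nohz_full_mask);
	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		if (use_flag)
			setup_by_flag(&m, cpu);
		else
			setup_by_tick_cpu(&m, cpu);
	}
	return m.tick_do_timer_cpu;
}
```

For example, with CPUs 0 and 1 marked nohz_full (mask 0x3) both variants hand
do_timer() to CPU 2 in this model, and with a housekeeping boot CPU it stays on
CPU 0. The model says nothing about the races or the _NONE case.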

But I won't argue. I will be happy to make a V2 which follows your recommendations,
but again, can I do this on top of this patch?

Oleg.