Re: [PATCH 1/2] timer: add add_timer_active_cpu()

From: Thomas Gleixner

Date: Tue May 05 2026 - 17:33:44 EST


On Thu, Apr 23 2026 at 09:19, Partha Satapathy wrote:
> From: Partha Sarathi Satapathy <partha.satapathy@xxxxxxxxxx>
>
> add_timer_on() can queue a timer on a CPU that goes offline before the
> timer is enqueued. When that happens, the timer remains unserviced until
> the CPU comes back online.
>
> Callers can try to avoid that by checking CPU state themselves, but that
> does not close the race with CPU hotplug. Taking the hotplug lock around
> every enqueue is also too expensive for users that only want CPU-local
> placement as a performance hint.
>
> Add add_timer_active_cpu() for callers that want to queue a timer on a

Aside from Frederic's comments, this is a patently bad function name: it
suggests that the timer can be placed on any active CPU, which is all the
more confusing as the function has a @cpu argument.

> @@ -1296,7 +1297,7 @@ EXPORT_SYMBOL(add_timer_global);
> *
> * See add_timer() for further details.
> */
> -void add_timer_on(struct timer_list *timer, int cpu)
> +static void __add_timer_on(struct timer_list *timer, int cpu, bool need_ol_cpu)

How is the kernel-doc comment still valid for that function?

Also 'need_ol_cpu' is the most ridiculous argument name I've seen in a
long time. My first read was: "Need [good] ol' CPU"

> {
>         struct timer_base *new_base, *base;
>         unsigned long flags;
> @@ -1333,6 +1334,18 @@ void add_timer_on(struct timer_list *timer, int cpu)
>                 WRITE_ONCE(timer->flags,
>                            (timer->flags & ~TIMER_BASEMASK) | cpu);
>         }
> +#ifdef CONFIG_HOTPLUG_CPU
> +        if (need_ol_cpu) {

This #ifdeffery is pointless and can be replaced with

        if (IS_ENABLED(CONFIG_HOTPLUG_CPU) && need_ol_cpu) {

But that's just a cosmetic detail.

> +                if (!base->is_active) {
> +                        raw_spin_unlock(&base->lock);
> +                        base = this_cpu_ptr(&timer_bases[BASE_LOCAL]);
> +                        raw_spin_lock(&base->lock);
> +                        cpu = smp_processor_id();
> +                        WRITE_ONCE(timer->flags,
> +                                   (timer->flags & ~TIMER_BASEMASK) | cpu);

This is broken vs. the TIMER_MIGRATING semantics: the base bits in
timer->flags are rewritten across a lock drop without setting
TIMER_MIGRATING first, so a concurrent lock_timer_base() can acquire the
old base lock and fiddle with the timer while it is moved under a
different lock.

> +void add_timer_on(struct timer_list *timer, int cpu)
> +{
> +        bool need_ol_cpu = false;
> +
> +        __add_timer_on(timer, cpu, need_ol_cpu);
> +}
> EXPORT_SYMBOL_GPL(add_timer_on);
>
> +/**
> + * add_timer_active_cpu - Start a timer on a particular CPU if online or current
> + * @timer: The timer to be started
> + * @cpu: The CPU to start it on
> + *
> + * This is like add_timer_on(), except that it queues the timer on the
> + * given CPU only when that CPU is online or on the current CPU.
> + */
> +void add_timer_active_cpu(struct timer_list *timer, int cpu)
> +{
> +        bool need_ol_cpu = true;
> +
> +        __add_timer_on(timer, cpu, need_ol_cpu);
> +}
> +EXPORT_SYMBOL_GPL(add_timer_active_cpu);

We surely need more exports just for adding a misnamed argument.

> +
> /**
> * __timer_delete - Internal function: Deactivate a timer
> * @timer: The timer to be deactivated
> @@ -2507,6 +2543,7 @@ int timers_prepare_cpu(unsigned int cpu)
>                 base->next_expiry_recalc = false;
>                 base->timers_pending = false;
>                 base->is_idle = false;
> +                base->is_active = true;

That's just wrong. The base is active only when the CPU reaches online state.
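
If that flag is to exist at all, it has to be set from a callback which
runs once the CPU is actually up; timers_prepare_cpu() is invoked at
CPUHP_TIMERS_PREPARE, i.e. before the CPU is brought online. Purely as an
illustration of where the flag would belong (neither the function below
nor a matching hotplug state exist today, and is_active is the field this
patch introduces):

int timers_online_cpu(unsigned int cpu)
{
        struct timer_base *base;
        int b;

        /* Make the per CPU bases available once the CPU is online */
        for (b = 0; b < NR_BASES; b++) {
                base = per_cpu_ptr(&timer_bases[b], cpu);
                base->is_active = true;
        }
        return 0;
}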

> @@ -2535,6 +2572,11 @@ int timers_dead_cpu(unsigned int cpu)
>
>                 WARN_ON_ONCE(old_base->running_timer);
>                 old_base->running_timer = NULL;
> +                /*
> +                 * Make the dead CPU base unavailable to add_timer_on()
> +                 * when the caller wants an active target CPU.
> +                 */
> +                old_base->is_active = false;

Once a CPU has gone offline, no timer should be queued on it anymore, and
add_timer_on() clearly lacks a

        if (WARN_ON(cpu_is_offline(cpu)))
                return;

If at all, what you really want is an interface (rough sketch below) which

1) allows the caller to give a hint about the CPU to place the timer on

2) checks whether the hinted CPU is online (or even active) and places
the timer on it if so

3) if #2 fails, picks a CPU on the same node when one is available
(online or active)

4) otherwise picks any online CPU (maybe preferring the calling CPU)
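
Something along these lines, as a rough sketch. The helper name and the
exact fallback order are made up for illustration and nothing like it
exists today:

#include <linux/cpumask.h>
#include <linux/smp.h>
#include <linux/topology.h>

/* Illustrative only: pick a target CPU from a placement hint */
static unsigned int timer_pick_target_cpu(int hint)
{
        unsigned int cpu;

        /* #2: use the hinted CPU when it is online (or even active) */
        if ((unsigned int)hint < nr_cpu_ids && cpu_online(hint))
                return hint;

        /* #3: otherwise prefer an online CPU on the hinted CPU's node */
        if ((unsigned int)hint < nr_cpu_ids) {
                cpu = cpumask_any_and(cpumask_of_node(cpu_to_node(hint)),
                                      cpu_online_mask);
                if (cpu < nr_cpu_ids)
                        return cpu;
        }

        /* #4: fall back to the calling CPU, which is online by definition */
        return raw_smp_processor_id();
}

Even then the picked CPU can go offline before the timer is actually
queued, which is exactly the race your changelog describes, so the
enqueue path has to be correct on its own and the hint stays what it is:
a hint.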

But let me look at the argument you provided to Frederic:

> Commit 4c00f3b0dea023a9851908a501735c899b0772f9
> ("net/rds: Add preferred_cpu option to rds_rdma.ko")
> added `preferred_cpu` so follow-up work can be placed for locality and
> performance, not because correctness requires that exact CPU.

If RDS fails to update the preferred CPU logic once that CPU goes
offline, then that's a clear bug in that code and should not be worked
around elsewhere.
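
Keeping such a hint valid is a few lines on the RDS side. A rough sketch,
where rds_preferred_cpu and the function names are hypothetical stand-ins
for whatever state rds_rdma actually keeps:

#include <linux/cpu.h>
#include <linux/cpuhotplug.h>

/* Hypothetical stand-in for the module's preferred_cpu setting */
static int rds_preferred_cpu = -1;

static int rds_pref_cpu_offline(unsigned int cpu)
{
        /* Invalidate the hint when the preferred CPU goes down */
        if (READ_ONCE(rds_preferred_cpu) == (int)cpu)
                WRITE_ONCE(rds_preferred_cpu, -1);
        return 0;
}

static int __init rds_pref_cpu_init(void)
{
        int ret;

        /*
         * Dynamic hotplug state: nothing to do when a CPU comes up,
         * reset the hint on the offline (teardown) path.
         */
        ret = cpuhp_setup_state_nocalls(CPUHP_AP_ONLINE_DYN, "rds:pref_cpu",
                                        NULL, rds_pref_cpu_offline);
        return ret < 0 ? ret : 0;
}

Hooked up from the existing module init, that keeps the preferred CPU
sane without any special casing in the timer core.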

Thanks,

tglx