Re: [PATCH 1/2] timers/migration: Fix hotplug migrator selection target on asymetric capacity machines

From: Frederic Weisbecker

Date: Mon Jun 08 2026 - 10:06:17 EST


On Mon, Jun 08, 2026 at 11:45:37AM +0200, Marek Szyprowski wrote:
> Dear All,
>
> On 20.05.2026 00:09, Frederic Weisbecker wrote:
> > When a top-level migrator is deactivated, either at CPU down hotplug
> > time or when a CPU is domain isolated, a new migrator is elected among
> > the available CPUs and woken up to take over the migration duty.
> >
> > However that election must happen at the scope of a given hierarchy and
> > not globally, which the introduction of per-capacity hierarchies failed
> > to handle.
> >
> > As a result a given hierarchy may end up without migrator to handle
> > global timers.
> >
> > Fix it with making sure that the new migrator belongs to the same
> > hierarchy as the outgoing CPU.
> >
> > Fixes: 098cbaad8e57 ("timers/migration: Split per-capacity hierarchies")
> > Signed-off-by: Frederic Weisbecker <frederic@xxxxxxxxxx>
>
> This patch landed recently in linux-next as commit e4a70f5fbd43 ("timers/migration:
> Fix hotplug migrator selection target on asymetric capacity machines"). In my tests
> I found that it breaks system suspend/resume on some legacy big.LITTLE ARM machines.
>
>
> Reverting $subject, together with dependent commit d4f198c13611 ("timers/migration:
> Deactivate per-capacity hierarchies under nohz_full") on top of linux-next fixes
> this issue. Here is the log from the system suspend/resume failure introduced by
> the $subject patch:
>
>
> root@target:~# time rtcwake -s10 -mmem
> rtcwake: wakeup from "mem" using /dev/rtc0 at Mon Jun  8 11:17:23 2026
> PM: suspend entry (deep)
> Filesystems sync: 0.000 seconds
> Freezing user space processes
> Freezing user space processes completed (elapsed 0.002 seconds)
> OOM killer disabled.
> Freezing remaining freezable tasks
> Freezing remaining freezable tasks completed (elapsed 0.042 seconds)
> printk: Suspending console(s) (use no_console_suspend to debug)
> ...
> Disabling non-boot CPUs ...
> ------------[ cut here ]------------
> WARNING: kernel/time/timer_migration.c:1505 at
> tmigr_clear_cpu_available+0x3b8/0x3c8, CPU#5: cpuhp/5/40

Thanks, but which tree is this? The only warning I see there is on line 1521
1532 (tip:timers/core).

It's probably line 1521 somehow. Is it possible that arch_scale_cpu_capacity()
returns a different result between CPU boot up and CPU down?

Thanks.

--
Frederic Weisbecker
SUSE Labs