Re: [PATCH 1/2] timers/migration: Fix hotplug migrator selection target on asymetric capacity machines
From: Marek Szyprowski
Date: Mon Jun 08 2026 - 10:25:47 EST
On 08.06.2026 16:04, Frederic Weisbecker wrote:
> On Mon, Jun 08, 2026 at 11:45:37AM +0200, Marek Szyprowski wrote:
>> On 20.05.2026 00:09, Frederic Weisbecker wrote:
>>> When a top-level migrator is deactivated, either at CPU down hotplug
>>> time or when a CPU is domain isolated, a new migrator is elected among
>>> the available CPUs and woken up to take over the migration duty.
>>>
>>> However that election must happen at the scope of a given hierarchy and
>>> not globally, which the introduction of per-capacity hierarchies failed
>>> to handle.
>>>
>>> As a result, a given hierarchy may end up without a migrator to handle
>>> global timers.
>>>
>>> Fix it by making sure that the new migrator belongs to the same
>>> hierarchy as the outgoing CPU.
>>>
>>> Fixes: 098cbaad8e57 ("timers/migration: Split per-capacity hierarchies")
>>> Signed-off-by: Frederic Weisbecker <frederic@xxxxxxxxxx>
>> This patch landed recently in linux-next as commit e4a70f5fbd43 ("timers/migration:
>> Fix hotplug migrator selection target on asymetric capacity machines"). In my tests
>> I found that it breaks system suspend/resume on some legacy big.LITTLE ARM machines.
>>
>>
>> Reverting $subject, together with the dependent commit d4f198c13611 ("timers/migration:
>> Deactivate per-capacity hierarchies under nohz_full") on top of linux-next fixes
>> this issue. Here is the log from the system suspend/resume failure introduced by
>> the $subject patch:
>>
>>
>> root@target:~# time rtcwake -s10 -mmem
>> rtcwake: wakeup from "mem" using /dev/rtc0 at Mon Jun 8 11:17:23 2026
>> PM: suspend entry (deep)
>> Filesystems sync: 0.000 seconds
>> Freezing user space processes
>> Freezing user space processes completed (elapsed 0.002 seconds)
>> OOM killer disabled.
>> Freezing remaining freezable tasks
>> Freezing remaining freezable tasks completed (elapsed 0.042 seconds)
>> printk: Suspending console(s) (use no_console_suspend to debug)
>> ...
>> Disabling non-boot CPUs ...
>> ------------[ cut here ]------------
>> WARNING: kernel/time/timer_migration.c:1505 at
>> tmigr_clear_cpu_available+0x3b8/0x3c8, CPU#5: cpuhp/5/40
> Thanks but which tree is this? The only warning I see there is on line 1521
> 1532 (tip:timers/core).
>
> It's probably line 1521 somehow. Is it possible that arch_scale_cpu_capacity()
> returns a different result between CPU boot up and CPU down?
The log was captured on a kernel built from commit e4a70f5fbd43.
Best regards
--
Marek Szyprowski, PhD
Samsung R&D Institute Poland