Re: [PATCH] timers/migration: Temporarily disable per capacity hierarchies

From: Christian Loehle

Date: Tue Jun 16 2026 - 09:34:02 EST


On 6/16/26 14:32, Frederic Weisbecker wrote:
> On Tue, Jun 09, 2026 at 04:34:17PM +0100, Christian Loehle wrote:
>> On 6/9/26 13:33, Frederic Weisbecker wrote:
>>> Some workloads with different CPU capacities consume more power with
>>> timer migration than before. The recently introduced per capacity
>>> hierarchies were supposed to alleviate this problem. However, they also
>>> appear to regress other types of workloads, especially when many
>>> different capacities coexist in the same machine.
>>>
>>> Disable the feature until a reasonable solution is found.
>>>
>>> Fixes: 098cbaad8e57 ("timers/migration: Split per-capacity hierarchies")
>>> Reported-by: Christian Loehle <christian.loehle@xxxxxxx>
>>> Signed-off-by: Frederic Weisbecker <frederic@xxxxxxxxxx>
>>> ---
>>> kernel/time/timer_migration.c | 2 +-
>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/kernel/time/timer_migration.c b/kernel/time/timer_migration.c
>>> index 548d84955f4c..e9d96d96e251 100644
>>> --- a/kernel/time/timer_migration.c
>>> +++ b/kernel/time/timer_migration.c
>>> @@ -1473,7 +1473,7 @@ static unsigned int tmigr_get_capacity(int cpu)
>>>  	 * timekeeper must then belong to the same hierarchy as all the nohz_full
>>>  	 * CPUs. Simply turn off capacity awareness when nohz_full is running.
>>>  	 */
>>> -	if (tick_nohz_full_enabled())
>>> +	if (tick_nohz_full_enabled() || !IS_ENABLED(CONFIG_BROKEN))
>>>  		return SCHED_CAPACITY_SCALE;
>>>  	else
>>>  		return arch_scale_cpu_capacity(cpu);
>>
>> FWIW
>> Reviewed-by: Christian Loehle <christian.loehle@xxxxxxx>
>> Thanks for looking into this after the late response!
>>
>> I have something based on avg_idle which doesn't look unreasonable at first glance.
>> I'll do some more testing and hopefully post it with some numbers soon!
>
> Here is another thing we can try.
>
> We can build the hierarchy just like we do with NUMA, but instead of
> NUMA nodes, use the capacity (Thomas's initial idea), and connect the
> resulting groups through a common root.
>
> This is roughly equivalent to per capacity hierarchies, but all those
> hierarchies are connected eventually to the root.
>
> And then upon CPU wakeup, actively "steal" the migrator duty from a
> higher capacity CPU.
>
> This way lower capacity CPUs have more chances to process global timers
> and they also have more chances to stay active due to that added work
> and also ground work tasks woken up by those timers (hopefully locally).
>
> I can try something like that. I'll just need to use atomic64_t for
> struct tmigr_group::migr_state to store the capacity of the migrator.
>
> Thanks.
>

Sounds reasonable to me, happy to give that a spin!
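
For reference, here is how I read the "steal the migrator duty" idea, as a
rough userspace model rather than kernel code. Everything in it is an
assumption made for illustration: the 64-bit layout (migrator mask in the low
32 bits, migrator capacity in the high 32 bits), the NO_MIGRATOR value and the
helper names are hypothetical and are not taken from timer_migration.c. C11
atomics stand in for atomic64_t.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical: a migrator mask of 0 means the group has no migrator. */
#define NO_MIGRATOR 0u

/* Hypothetical layout: capacity in bits 63..32, migrator mask in bits 31..0. */
static uint64_t pack_state(uint32_t migrator, uint32_t capacity)
{
	return ((uint64_t)capacity << 32) | migrator;
}

static uint32_t state_migrator(uint64_t state)
{
	return (uint32_t)state;
}

static uint32_t state_capacity(uint64_t state)
{
	return (uint32_t)(state >> 32);
}

/*
 * Called by a CPU that just woke up. It takes over the migrator duty if
 * the group has no migrator, or if the current migrator has a strictly
 * higher capacity than the waking CPU. Returns true if the caller is now
 * the migrator.
 */
static bool try_steal_migrator(_Atomic uint64_t *state, uint32_t self,
			       uint32_t capacity)
{
	uint64_t old = atomic_load(state);

	assert(self != NO_MIGRATOR);

	for (;;) {
		/* An equal or lower capacity migrator keeps the duty. */
		if (state_migrator(old) != NO_MIGRATOR &&
		    state_capacity(old) <= capacity)
			return false;

		/* On failure, 'old' is refreshed and the check is redone. */
		if (atomic_compare_exchange_weak(state, &old,
						 pack_state(self, capacity)))
			return true;
	}
}
```

With this model, a CPU of capacity 256 waking up while a capacity 1024 CPU
holds the duty takes it over, while a second capacity 1024 CPU does not. The
point of packing both fields into one word is that the capacity check and the
handover happen in a single compare-and-exchange.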