Re: [RFC][RFT][PATCH 0/3] arm64: Enable asympacking for minor CPPC asymmetry
From: Vincent Guittot
Date: Thu Mar 26 2026 - 03:57:41 EST
On Wed, 25 Mar 2026 at 19:13, Christian Loehle <christian.loehle@xxxxxxx> wrote:
>
> The scheduler currently handles CPU performance asymmetry via either:
>
> - SD_ASYM_PACKING: simple priority-based task placement (x86 ITMT)
> - SD_ASYM_CPUCAPACITY: capacity-aware scheduling
>
> On arm64, capacity-aware scheduling is used for any detected capacity
> differences.
>
> Some systems expose small per-CPU performance differences via CPPC
> highest_perf (e.g. due to chip binning), resulting in slightly different
> capacities (<~5%). These differences are sufficient to trigger
> SD_ASYM_CPUCAPACITY, even though the system is otherwise effectively
> symmetric.
>
> For such small deltas, capacity-aware scheduling is unnecessarily
> complex. A simpler priority-based approach, similar to x86 ITMT, is
> sufficient.
I'm not convinced that moving to SD_ASYM_PACKING is the right way to
move forward.

First of all, do you target all kinds of systems or only SMT ones? It's not
clear from your cover letter.

Moving to asym packing for !SMT doesn't make sense to me. If you don't
want EAS enabled, you can disable it by writing 0 to
/proc/sys/kernel/sched_energy_aware.

For SMT systems with a small capacity difference, I would prefer that we
look at supporting SMT in SD_ASYM_CPUCAPACITY, starting with
select_idle_capacity().
>
> This series introduces support for using asymmetric packing in that case:
>
> - derive per-CPU priorities from CPPC highest_perf
> - detect when CPUs differ but not enough to form distinct capacity classes
> - suppress SD_ASYM_CPUCAPACITY for such domains
> - enable SD_ASYM_PACKING and use CPPC-based priority ordering instead
>
> The asympacking flag is exposed at all topology levels; domains with
> equal priorities are unaffected, while domains spanning CPUs with
> different priorities can honor the ordering.
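
For illustration only, the detection step listed above could be sketched in
userspace C roughly as follows. This is not the code from the series: the
function names, the 5% threshold (taken from the "<~5%" figure in the cover
letter), the choice of the maximum as the reference value, and the direct use
of highest_perf as the priority are all assumptions.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumed threshold: a spread below 5% of the maximum counts as "minor". */
#define MINOR_ASYM_PCT 5

/*
 * Hypothetical helper: return true when the per-CPU highest_perf values
 * differ, but by less than MINOR_ASYM_PCT percent of the largest value.
 * Identical values (fully symmetric) and large spreads both return false.
 */
static bool minor_asymmetry(const unsigned int *perf, size_t n)
{
	unsigned int min, max;
	size_t i;

	if (n == 0)
		return false;

	min = max = perf[0];
	for (i = 1; i < n; i++) {
		if (perf[i] < min)
			min = perf[i];
		if (perf[i] > max)
			max = perf[i];
	}

	if (min == max)
		return false;

	return (unsigned long long)(max - min) * 100 <
	       (unsigned long long)max * MINOR_ASYM_PCT;
}

/*
 * Hypothetical helper: with highest_perf used directly as the priority,
 * the preferred CPU is the one with the largest value (lowest index wins
 * ties). The array must not be empty.
 */
static size_t preferred_cpu(const unsigned int *perf, size_t n)
{
	size_t i, best = 0;

	for (i = 1; i < n; i++) {
		if (perf[i] > perf[best])
			best = i;
	}
	return best;
}
```

With values such as { 1024, 1000, 1010, 990 } the spread is about 3% of the
maximum, so minor_asymmetry() returns true and a priority ordering would
apply. With { 1024, 512 } it returns false and the domain would keep
SD_ASYM_CPUCAPACITY.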
>
> RFC:
> I'm not entirely sure if this is the best way to implement this.
> Currently this is baked into CPPC and arm64, while neither are strictly
> necessary, we could also use cpu_capacity directly to derive the
> ordering and enable this for non-CPPC and/or non-arm64.
> RFT:
> Andrea, please give this a try. This should perform better in particular
> for single-threaded workloads and workloads that do not utilize all
> cores (all the time anyway).
> Capacity-aware scheduling wakeup works very different to the SMP path
> used now, some workloads will benefit, some regress, it would be nice
> to get some test results for these.
> We already discussed DCPerf MediaWiki seems to benefit from
> capacity-aware scheduling wakeup behavior, but others (most?) should
> benefit from this series.
>
> I don't know if we can also be clever about ordering amongst SMT siblings.
> That would be dependent on the uarch and I don't have a platform to
> experiment with this though, so consider this series orthogonal to the
> idle-core SMT considerations.
> On platforms with SMT though asympacking makes a lot more sense than
> capacity-aware scheduling, because arguing about capacity without
> considering utilization of the sibling(s) (and the resulting potential
> 'stolen' capacity we perceive) isn't theoretically sound.
>
> Christian Loehle (3):
> sched/topology: Introduce arch hooks for asympacking
> arch_topology: Export CPPC-based asympacking prios
> arm64/sched: Enable CPPC-based asympacking
>
> arch/arm64/include/asm/topology.h | 6 +++++
> arch/arm64/kernel/topology.c | 34 ++++++++++++++++++++++++++
> drivers/base/arch_topology.c | 40 +++++++++++++++++++++++++++++++
> include/linux/arch_topology.h | 24 +++++++++++++++++++
> include/linux/sched/topology.h | 9 +++++++
> kernel/sched/fair.c | 16 -------------
> kernel/sched/topology.c | 34 ++++++++++++++++++++------
> 7 files changed, 140 insertions(+), 23 deletions(-)
>
> --
> 2.34.1
>