[RFC][RFT][PATCH 0/3] arm64: Enable asympacking for minor CPPC asymmetry

From: Christian Loehle

Date: Wed Mar 25 2026 - 14:13:38 EST


The scheduler currently handles CPU performance asymmetry via either:

- SD_ASYM_PACKING: simple priority-based task placement (x86 ITMT)
- SD_ASYM_CPUCAPACITY: capacity-aware scheduling

On arm64, capacity-aware scheduling is used for any detected capacity
differences.

Some systems expose small per-CPU performance differences via CPPC
highest_perf (e.g. due to chip binning), resulting in slightly different
capacities (<~5%). These differences are sufficient to trigger
SD_ASYM_CPUCAPACITY, even though the system is otherwise effectively
symmetric.

For such small deltas, capacity-aware scheduling is unnecessarily
complex. A simpler priority-based approach, similar to x86 ITMT, is
sufficient.

This series introduces support for using asymmetric packing in that case:

- derive per-CPU priorities from CPPC highest_perf
- detect when CPUs differ but not enough to form distinct capacity classes
- suppress SD_ASYM_CPUCAPACITY for such domains
- enable SD_ASYM_PACKING and use CPPC-based priority ordering instead
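
As a rough illustration of the detection step above, here is a minimal
userspace sketch (not the code from this series). The names
classify_asymmetry() and MINOR_ASYM_PCT, and the exact 5% cut-off, are
assumptions made for the example; the series itself decides based on
whether the CPUs form distinct capacity classes.

```c
#include <stddef.h>

/* Assumed threshold: a spread of up to 5% of the max counts as "minor". */
#define MINOR_ASYM_PCT 5

/* Hypothetical result type, standing in for the sched domain flag choice. */
enum asym_mode {
	ASYM_NONE,		/* all CPUs report the same highest_perf */
	ASYM_PACKING,		/* minor delta: priority-based ordering */
	ASYM_CPUCAPACITY,	/* large delta: capacity-aware scheduling */
};

/* Classify a set of per-CPU highest_perf values by their relative spread. */
static enum asym_mode classify_asymmetry(const unsigned int *highest_perf,
					 size_t n)
{
	unsigned int min, max;
	size_t i;

	if (n == 0)
		return ASYM_NONE;

	min = max = highest_perf[0];
	for (i = 1; i < n; i++) {
		if (highest_perf[i] < min)
			min = highest_perf[i];
		if (highest_perf[i] > max)
			max = highest_perf[i];
	}

	if (min == max)
		return ASYM_NONE;

	/* (max - min) / max <= 5%, kept in integer arithmetic */
	if ((unsigned long long)(max - min) * 100 <=
	    (unsigned long long)max * MINOR_ASYM_PCT)
		return ASYM_PACKING;

	return ASYM_CPUCAPACITY;
}
```

With this sketch, highest_perf values of {1000, 980, 990, 1000} would be
classified as ASYM_PACKING, while {1024, 512} would remain
ASYM_CPUCAPACITY.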

The asympacking flag is exposed at all topology levels; domains with
equal priorities are unaffected, while domains spanning CPUs with
different priorities can honor the ordering.
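
To illustrate why domains with equal priorities are unaffected, here is a
small userspace sketch (again, not the series' code). The helper
preferred_cpu() is hypothetical: it returns the index of the
highest-priority CPU in a domain span, or -1 when all priorities are equal
and there is no ordering to honor.

```c
#include <stddef.h>

/*
 * Hypothetical helper: prio[] holds one priority per CPU in the span
 * (higher value means preferred). Ties resolve to the lowest index.
 */
static int preferred_cpu(const int *prio, size_t n)
{
	size_t i, best = 0;
	int differs = 0;

	if (n == 0)
		return -1;

	for (i = 1; i < n; i++) {
		if (prio[i] != prio[0])
			differs = 1;
		if (prio[i] > prio[best])
			best = i;
	}

	/* Equal priorities everywhere: nothing to pack towards. */
	return differs ? (int)best : -1;
}
```

For priorities {10, 12, 11} the sketch picks index 1; for {10, 10, 10} it
returns -1, so setting the flag on such a domain changes nothing.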

RFC:
I'm not entirely sure this is the best way to implement this.
Currently it is baked into CPPC and arm64, although neither is strictly
necessary; we could also use cpu_capacity directly to derive the
ordering and enable this for non-CPPC and/or non-arm64 systems.
RFT:
Andrea, please give this a try. This should perform better in particular
for single-threaded workloads and workloads that do not utilize all
cores (all the time anyway).
Capacity-aware scheduling wakeup works very differently from the SMP path
used now: some workloads will benefit and some will regress, so it would
be nice to get some test results for these.
We already discussed that DCPerf MediaWiki seems to benefit from
capacity-aware scheduling wakeup behavior, but others (most?) should
benefit from this series.

I don't know if we can also be clever about ordering amongst SMT siblings.
That would depend on the uarch, and I don't have a platform to
experiment with this, so consider this series orthogonal to the
idle-core SMT considerations.
On platforms with SMT, though, asympacking makes a lot more sense than
capacity-aware scheduling, because reasoning about capacity without
considering the utilization of the sibling(s) (and the resulting potential
'stolen' capacity we perceive) isn't theoretically sound.

Christian Loehle (3):
  sched/topology: Introduce arch hooks for asympacking
  arch_topology: Export CPPC-based asympacking prios
  arm64/sched: Enable CPPC-based asympacking
arch/arm64/include/asm/topology.h | 6 +++++
arch/arm64/kernel/topology.c | 34 ++++++++++++++++++++++++++
drivers/base/arch_topology.c | 40 +++++++++++++++++++++++++++++++
include/linux/arch_topology.h | 24 +++++++++++++++++++
include/linux/sched/topology.h | 9 +++++++
kernel/sched/fair.c | 16 -------------
kernel/sched/topology.c | 34 ++++++++++++++++++++------
7 files changed, 140 insertions(+), 23 deletions(-)

--
2.34.1