Re: [PATCH 0/4] sched/fair: SMT-aware asymmetric CPU capacity
From: Dietmar Eggemann
Date: Wed Apr 01 2026 - 08:38:50 EST
On 31.03.26 11:04, Andrea Righi wrote:
> Hi Dietmar,
>
> On Tue, Mar 31, 2026 at 12:30:55AM +0200, Dietmar Eggemann wrote:
>> Hi Andrea,
>>
>> On 26.03.26 16:02, Andrea Righi wrote:
[...]
>> So does (2) with NO_SIS_UTIL perform worse than (1) with your smt
>> related add-ons in sic()?
>
> Thanks for running these experiments and sharing the data, this is very
> useful!
>
> I did a quick test on Vera using the NVBLAS benchmark, comparing NO
> ASYM_CPUCAPACITY with and without SIS_UTIL, but the difference seems to be
> within error range. I'll also run DCPerf MediaWiki with all the different

I'm not familiar with the NVBLAS benchmark. Does it drive your system
into 'sd->shared->nr_idle_scan = 0' state?

We just have to understand where this benefit of using sic() instead of
sis() is coming from. I doubt that it is the best_cpu handling after
'if (!choose_idle_cpu(cpu, p))' in sic()'s
'for_each_cpu_wrap(cpu, cpus, target)' loop, given that the CPU capacity
diffs are so small.
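
To make the 'nr_idle_scan = 0' question concrete, here is a minimal
userspace sketch of a budgeted idle-CPU scan. It is only a model: the
function name and the flat idle[] array are hypothetical, and the
"budget + 1, bail out when the budget is 0" shape is an assumption
modelled on how the SIS_UTIL path in sis() appears to behave, not a copy
of the kernel code. If the benchmark does reach that state, a scan like
this gives up before looking at any CPU, which might be one place the
sic() vs. sis() difference comes from.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Userspace model of a budgeted idle-CPU scan. All names here are
 * hypothetical; this is not the kernel's select_idle_cpu().
 *
 * idle[]        per-CPU idle flags
 * ncpus         number of CPUs in the modelled LLC domain
 * target        CPU where the wrap-around scan starts
 * nr_idle_scan  scan budget (assumed semantics: 0 means "do not scan")
 *
 * Returns the first idle CPU found within the budget, or -1.
 */
static int model_scan_idle_cpu(const bool *idle, int ncpus, int target,
			       int nr_idle_scan)
{
	int nr = nr_idle_scan + 1;
	int i;

	assert(ncpus > 0 && target >= 0 && target < ncpus);

	/* Budget of 0: the domain is treated as overloaded, skip the scan. */
	if (nr == 1)
		return -1;

	for (i = 0; i < ncpus; i++) {
		int cpu = (target + i) % ncpus;

		/* Each visited CPU consumes one unit of budget. */
		if (--nr <= 0)
			return -1;
		if (idle[cpu])
			return cpu;
	}

	return -1;
}
```

With only the last of four CPUs idle, a budget of 0 or 3 (starting at
CPU 0) returns -1 in this model, while a budget of 4 finds CPU 3.
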
> configurations to see if I get similar results.
>
> More generally, I agree that for small capacity differences (e.g., within
> ~5%) the benefit of using ASYM_CPUCAPACITY is questionable. And I'm also
> fine to go back to the idea of grouping together CPUs within the 5%
> capacity window, if we think it's a safer approach (results in your case
> are quite evident, and BTW, that means we also shouldn't have
> ASYM_CPUCAPACITY on Grace, so in theory the 5% threshold should also
> improve performance on Grace, which doesn't have SMT).

There shouldn't be so many machines with these binning-introduced small
CPU capacity diffs out there? In fact, I only know about your Grace
(!smt) and Vera (smt) machines.
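
For illustration, the "5% capacity window" idea could be sketched in
userspace C roughly as below. The helper names are hypothetical, and
the 1078/1024 margin is an assumption modelled on the ~5% comparison
used by capacity_greater() in fair.c. The point is only that a topology
whose min and max CPU capacities fall inside the window would be
treated as symmetric, i.e. it would not get ASYM_CPUCAPACITY.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper: true if capacity a is more than ~5% greater
 * than capacity b (1078/1024 is roughly 1.053).
 */
static bool cap_greater_5pct(unsigned long a, unsigned long b)
{
	return a * 1024 > b * 1078;
}

/*
 * Hypothetical helper: decide whether a set of CPU capacities is
 * "meaningfully" asymmetric, i.e. whether the largest capacity exceeds
 * the smallest one by more than the ~5% window.
 */
static bool has_meaningful_asym(const unsigned long *caps, size_t n)
{
	unsigned long min, max;
	size_t i;

	assert(n > 0);

	min = max = caps[0];
	for (i = 1; i < n; i++) {
		if (caps[i] < min)
			min = caps[i];
		if (caps[i] > max)
			max = caps[i];
	}

	return cap_greater_5pct(max, min);
}
```

With made-up capacities, { 1024, 1009, 992 } stays inside the window
(treated as symmetric), whereas { 1024, 512 } does not.
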
> That said, I still think there's value in adding SMT awareness to
> select_idle_capacity(). Even if we decide to avoid ASYM_CPUCAPACITY for
> small capacity deltas, we should ensure that the behavior remains
> reasonable if both features are enabled, for any reason. Right now, there
> are cases where the current behavior leads to significant performance
> degradation (~2x), so having a mechanism to prevent clearly suboptimal task
> placement still seems worthwhile. Essentially, what I'm saying is that one
> thing doesn't exclude the other.

IMHO, if we knew where the improvement from using sic() instead of the
default sis() (which already has smt support) is coming from, then
maybe. It's a lot of extra code in the end ... And mobile big.LITTLE
(with larger CPU capacity diffs) doesn't have smt.