Re: [RESEND RFC PATCH V3] sched: Improve scalability of select_idle_sibling using SMT balance
From: Peter Zijlstra
Date: Mon Feb 05 2018 - 07:20:02 EST
On Fri, Feb 02, 2018 at 09:37:02AM -0800, Subhra Mazumdar wrote:
> In the scheme of SMT balance, if the idle cpu search is done _not_ in the
> last run core, then we need a random cpu to start from. If the idle cpu
> search is done in the last run core we can start the search from last run
> cpu. Since we need the random index for the first case I just did it for
> both.
That shouldn't be too hard to fix. I think we can simply transpose the
CPU number. That is, something like:

  cpu' = core'_id + (cpu - core_id)

should work for most sane cases. We don't give any guarantees that this
will in fact work, but it should work for (almost) all actual CPU
enumeration schemes I've seen.

And if it doesn't work, we're no worse off than we are now.

I just couldn't readily find a place where we need to do this for cores
with the current code. But I think we have one place between LLCs where
it can be done:
---
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 7b6535987500..eb8b8d0a026c 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -6109,7 +6109,7 @@ static int select_idle_smt(struct task_struct *p, struct sched_domain *sd, int t
 	if (!static_branch_likely(&sched_smt_present))
 		return -1;
 
-	for_each_cpu(cpu, cpu_smt_mask(target)) {
+	for_each_cpu_wrap(cpu, cpu_smt_mask(target), target) {
 		if (!cpumask_test_cpu(cpu, &p->cpus_allowed))
 			continue;
 		if (idle_cpu(cpu))
@@ -6357,8 +6357,17 @@ select_task_rq_fair(struct task_struct *p, int prev_cpu, int sd_flag, int wake_f
 		if (cpu == prev_cpu)
 			goto pick_cpu;
 
-		if (wake_affine(affine_sd, p, prev_cpu, sync))
-			new_cpu = cpu;
+		if (wake_affine(affine_sd, p, prev_cpu, sync)) {
+			/*
+			 * Transpose prev_cpu's offset into this cpu's
+			 * LLC domain to retain the 'random' search offset
+			 * for for_each_cpu_wrap().
+			 */
+			new_cpu = per_cpu(sd_llc_id, cpu) +
+				  (prev_cpu - per_cpu(sd_llc_id, prev_cpu));
+			if (unlikely(!cpus_share_cache(new_cpu, cpu)))
+				new_cpu = cpu;
+		}
 	}
 
 	if (sd && !(sd_flag & SD_BALANCE_FORK)) {