[RFC PATCH 6/7] sched/fair: Split select_task_rq_fair want_affine logic

From: Valentin Schneider
Date: Wed Dec 11 2019 - 11:44:56 EST


The domain loop within select_task_rq_fair() depends on a few bits of
input, namely the SD flag we're looking for and whether we want_affine.

For !want_affine, the domain loop will walk up the hierarchy to reach the
highest domain with both SD_LOAD_BALANCE and the requested sd_flag
(SD_BALANCE_{WAKE, FORK, EXEC}) set.
In other words, that's a call to highest_flags_domain() for these two
flags. Note that this is static information for a given SD hierarchy, so
it can be cached, but that comes in a later patch to ease reviewing.
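
As a rough userspace illustration of that lookup (not kernel code), the
sketch below walks a toy parent-linked hierarchy. The toy_* names and the
flag values are hypothetical, and the stop-at-first-mismatch behaviour is
an assumption modelled on the !want_affine loop described above; the real
highest_flags_domain() is not shown in this patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag values, for illustration only (not the kernel's). */
#define TOY_SD_LOAD_BALANCE  0x01
#define TOY_SD_BALANCE_WAKE  0x02
#define TOY_SD_BALANCE_FORK  0x04

/* Toy stand-in for struct sched_domain: flags plus a parent link. */
struct toy_domain {
	int flags;
	struct toy_domain *parent;
};

/*
 * Walk up from the lowest domain and return the highest one in the
 * unbroken run of domains that have all of 'flags' set, or NULL if the
 * lowest domain already lacks one of them.
 */
static struct toy_domain *
toy_highest_flags_domain(struct toy_domain *lowest, int flags)
{
	struct toy_domain *sd, *hsd = NULL;

	for (sd = lowest; sd; sd = sd->parent) {
		if ((sd->flags & flags) != flags)
			break;
		hsd = sd;
	}
	return hsd;
}
```

Since the result depends only on the flags of the hierarchy, it stays
the same until the hierarchy itself changes.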

For want_affine, we'll walk up the hierarchy to reach the first domain
that has both SD_LOAD_BALANCE and SD_WAKE_AFFINE set and that spans the
task's prev_cpu. We still save a pointer to the last visited domain that
had the requested sd_flag set (and SD_LOAD_BALANCE), which means that if
the affine condition is never met (e.g. no domain had SD_WAKE_AFFINE) we'll
use the same SD as we would have found if we had !want_affine.
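
The want_affine walk can be modelled in userspace along the same lines.
This is only a sketch: the toy_* names, the flag values and the bitmask
standing in for sched_domain_span() are all hypothetical, and the
wake_affine() call itself is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag values, for illustration only (not the kernel's). */
#define TOY_SD_LOAD_BALANCE  0x01
#define TOY_SD_BALANCE_WAKE  0x02
#define TOY_SD_WAKE_AFFINE   0x04

/* Toy domain: 'span' is a bitmask of CPU ids standing in for a cpumask. */
struct toy_affine_domain {
	int flags;
	unsigned long span;
	struct toy_affine_domain *parent;
};

/*
 * Model of the want_affine walk. Returns the affine domain if one is
 * found (the caller would then try wake_affine()), otherwise NULL.
 * '*fallback' ends up as the last visited domain with 'sd_flag' set, or
 * as NULL when an affine domain is found.
 */
static struct toy_affine_domain *
toy_affine_walk(struct toy_affine_domain *lowest, int prev_cpu, int sd_flag,
		struct toy_affine_domain **fallback)
{
	struct toy_affine_domain *tmp;

	*fallback = NULL;
	for (tmp = lowest; tmp; tmp = tmp->parent) {
		if (!(tmp->flags & TOY_SD_LOAD_BALANCE))
			break;

		if ((tmp->flags & TOY_SD_WAKE_AFFINE) &&
		    (tmp->span & (1UL << prev_cpu))) {
			*fallback = NULL; /* the affine path wins */
			return tmp;
		}

		if (tmp->flags & sd_flag)
			*fallback = tmp;
	}
	return NULL;
}
```

When no domain in the toy hierarchy has the affine flag, the fallback
pointer is what the caller would hand to the slow path.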

Split the domain loop into !want_affine and want_affine paths. As it is,
this leads to two domain walks instead of a single one, but stay tuned for
the next patch.

Signed-off-by: Valentin Schneider <valentin.schneider@xxxxxxx>
---
kernel/sched/fair.c | 29 ++++++++++++++++++-----------
1 file changed, 18 insertions(+), 11 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 30e8d357a24f..ea875c7c82d7 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -6370,29 +6370,36 @@ select_task_rq_fair(struct task_struct *p, int prev_cpu, int wake_flags)
 	}

 	rcu_read_lock();
+
+	sd = highest_flags_domain(cpu, sd_flag | SD_LOAD_BALANCE);
+
+	/*
+	 * If !want_affine, we just look for the highest domain where
+	 * sd_flag is set.
+	 */
+	if (!want_affine)
+		goto scan;
+
+	/*
+	 * Otherwise we look for the lowest domain with SD_WAKE_AFFINE and that
+	 * spans both 'cpu' and 'prev_cpu'.
+	 */
 	for_each_domain(cpu, tmp) {
 		if (!(tmp->flags & SD_LOAD_BALANCE))
 			break;

-		/*
-		 * If both 'cpu' and 'prev_cpu' are part of this domain,
-		 * cpu is a valid SD_WAKE_AFFINE target.
-		 */
-		if (want_affine && (tmp->flags & SD_WAKE_AFFINE) &&
+		if ((tmp->flags & SD_WAKE_AFFINE) &&
 		    cpumask_test_cpu(prev_cpu, sched_domain_span(tmp))) {
 			if (cpu != prev_cpu)
 				new_cpu = wake_affine(tmp, p, cpu, prev_cpu, sync);

-			sd = NULL; /* Prefer wake_affine over balance flags */
+			/* Prefer wake_affine over SD lookup */
+			sd = NULL;
 			break;
 		}
-
-		if (tmp->flags & sd_flag)
-			sd = tmp;
-		else if (!want_affine)
-			break;
 	}

+scan:
 	if (unlikely(sd)) {
 		/* Slow path */
 		new_cpu = find_idlest_cpu(sd, p, cpu, prev_cpu, sd_flag);
--
2.24.0