Hi Subhra,

I had tested hackbench on SPARC SMT8 (see numbers in the cover letter), and I ran
your patch series on IBM POWER systems; this is what I have observed.
On 6/27/19 6:59 AM, subhra mazumdar wrote:
Rotate the cpu search window for better spread of threads. This will ensure
an idle cpu will quickly be found if one exists.
Signed-off-by: subhra mazumdar <subhra.mazumdar@xxxxxxxxxx>
---
kernel/sched/fair.c | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index b58f08f..c1ca88e 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -6188,7 +6188,7 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, int t
 	u64 avg_cost, avg_idle;
 	u64 time, cost;
 	s64 delta;
-	int cpu, limit, floor, nr = INT_MAX;
+	int cpu, limit, floor, target_tmp, nr = INT_MAX;

 	this_sd = rcu_dereference(*this_cpu_ptr(&sd_llc));
 	if (!this_sd)
@@ -6219,9 +6219,15 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, int t
 		}
 	}

+	if (per_cpu(next_cpu, target) != -1)
+		target_tmp = per_cpu(next_cpu, target);
+	else
+		target_tmp = target;
+
 	time = local_clock();
-	for_each_cpu_wrap(cpu, sched_domain_span(sd), target) {
+	for_each_cpu_wrap(cpu, sched_domain_span(sd), target_tmp) {
+		per_cpu(next_cpu, target) = cpu;
This leads to a problem of cache hotness.

AFAIU, in most cases `target = prev_cpu` of the task being woken up, and
selecting an idle CPU nearest to prev_cpu is favorable. But since this
doesn't keep track of the last idle cpu per task, it fails to find the
nearest possible idle CPU in cases where the task is woken up after other
tasks have been scheduled in between.
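
To make that concrete, here is a minimal, untested sketch of the kind of
per-task tracking I mean. The `last_idle_cpu` field is a hypothetical new
member of struct task_struct (initialized to -1); nothing like it exists in
this series, and the helper name is made up for illustration:

/*
 * Hypothetical sketch (not part of this series): start the search from
 * the CPU where this task last found an idle CPU, so the wakeup stays
 * close to prev_cpu. p->last_idle_cpu is an assumed new task_struct
 * field, initialized to -1.
 */
static int select_idle_cpu_nearest(struct task_struct *p,
				   struct sched_domain *sd, int target)
{
	int start = p->last_idle_cpu >= 0 ? p->last_idle_cpu : target;
	int cpu;

	for_each_cpu_wrap(cpu, sched_domain_span(sd), start) {
		if (available_idle_cpu(cpu)) {
			/* Remember per task, not per LLC, to keep locality. */
			p->last_idle_cpu = cpu;
			return cpu;
		}
	}

	return -1;
}

Compared to the per-LLC next_cpu rotation above, something like this would
keep a waking task near its previous CPU (better cache hotness), though it
gives up the spread the rotation is trying to achieve; it is only meant to
illustrate the trade-off.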