[tip:sched/urgent] sched/core: Fix remote wakeups

From: tip-bot for Peter Zijlstra
Date: Wed May 25 2016 - 03:14:49 EST

Commit-ID: b7e7ade34e6188bee2e3b0d42b51d25137d9e2a5
Gitweb: http://git.kernel.org/tip/b7e7ade34e6188bee2e3b0d42b51d25137d9e2a5
Author: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
AuthorDate: Mon, 23 May 2016 11:19:07 +0200
Committer: Ingo Molnar <mingo@xxxxxxxxxx>
CommitDate: Wed, 25 May 2016 08:35:18 +0200

sched/core: Fix remote wakeups

Commit:

  b5179ac70de8 ("sched/fair: Prepare to fix fairness problems on migration")

... introduced a bug: Mike Galbraith found that it introduced a
performance regression, while Paul E. McKenney reported lost
wakeups and bisected it to this commit.

The reason is that I misread ttwu_queue() and assumed that any
wakeup queued on a remote CPU must have had its task migrated.

Since this is not so, we need to carry this information from the
point where the wakeup is queued to the point where it is actually
performed. Use a new task_struct::sched_flag bit
(sched_remote_wakeup) for this; we already write to
sched_contributes_to_load in the wakeup path, so this cacheline is
already hot and modified.
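
The hand-off described above can be modelled in plain userspace C. This
is only a sketch under stated assumptions, not the kernel code: struct
task, struct wake_list, queue_remote() and drain_pending() are
hypothetical stand-ins for task_struct, the per-rq wake_list,
ttwu_queue_remote() and the loop in sched_ttwu_pending(), and the
numeric value given to WF_MIGRATED is illustrative.

```c
#include <assert.h>
#include <stddef.h>

#define WF_MIGRATED 0x04	/* illustrative value for this sketch */

/* Hypothetical stand-in for the relevant bits of task_struct. */
struct task {
	unsigned sched_contributes_to_load:1;
	unsigned sched_remote_wakeup:1;	/* written at queue time, read at wakeup time */
	struct task *next;		/* stand-in for the llist wake_entry */
};

/* Hypothetical stand-in for a remote runqueue's wake_list. */
struct wake_list {
	struct task *head;
};

/* Queue side: record whether this wakeup really involved a migration. */
static void queue_remote(struct wake_list *wl, struct task *p, int wake_flags)
{
	assert(wl != NULL && p != NULL);
	p->sched_remote_wakeup = !!(wake_flags & WF_MIGRATED);
	p->next = wl->head;
	wl->head = p;
}

/*
 * Wakeup side: rebuild wake_flags per task from the stored bit instead
 * of assuming WF_MIGRATED for every queued task. Returns how many tasks
 * were activated with WF_MIGRATED; *activated receives the total.
 */
static int drain_pending(struct wake_list *wl, int *activated)
{
	struct task *p = wl->head;
	int migrated = 0;

	wl->head = NULL;
	*activated = 0;
	while (p) {
		struct task *next = p->next;
		int wake_flags = 0;

		if (p->sched_remote_wakeup)
			wake_flags = WF_MIGRATED;
		if (wake_flags & WF_MIGRATED)
			migrated++;
		(*activated)++;
		p->next = NULL;
		p = next;
	}
	return migrated;
}
```

Queueing one task with WF_MIGRATED and one with no flags, then
draining, activates both but treats only the first as migrated. Before
the fix, the equivalent loop would have treated both as migrated.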

Reported-by: Paul E. McKenney <paulmck@xxxxxxxxxxxxxxxxxx>
Reported-by: Mike Galbraith <umgwanakikbuti@xxxxxxxxx>
Tested-by: Mike Galbraith <umgwanakikbuti@xxxxxxxxx>
Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
Fixes: b5179ac70de8 ("sched/fair: Prepare to fix fairness problems on migration")
Link: http://lkml.kernel.org/r/20160523091907.GD15728@xxxxxxxxxxxxxxxxxxxxxxxxxx
Signed-off-by: Ingo Molnar <mingo@xxxxxxxxxx>
---
include/linux/sched.h | 1 +
kernel/sched/core.c | 18 +++++++++++-------
2 files changed, 12 insertions(+), 7 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 6cc0df9..e053517 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1533,6 +1533,7 @@ struct task_struct {
unsigned sched_reset_on_fork:1;
unsigned sched_contributes_to_load:1;
unsigned sched_migrated:1;
+ unsigned sched_remote_wakeup:1;
unsigned :0; /* force alignment to the next boundary */

/* unserialized, strictly 'current' */
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 404c078..7f2cae4 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1768,13 +1768,15 @@ void sched_ttwu_pending(void)
cookie = lockdep_pin_lock(&rq->lock);

while (llist) {
+ int wake_flags = 0;
p = llist_entry(llist, struct task_struct, wake_entry);
llist = llist_next(llist);
- /*
- * See ttwu_queue(); we only call ttwu_queue_remote() when
- * its a x-cpu wakeup.
- */
- ttwu_do_activate(rq, p, WF_MIGRATED, cookie);
+ if (p->sched_remote_wakeup)
+ wake_flags = WF_MIGRATED;
+ ttwu_do_activate(rq, p, wake_flags, cookie);

lockdep_unpin_lock(&rq->lock, cookie);
@@ -1819,10 +1821,12 @@ void scheduler_ipi(void)

-static void ttwu_queue_remote(struct task_struct *p, int cpu)
+static void ttwu_queue_remote(struct task_struct *p, int cpu, int wake_flags)
struct rq *rq = cpu_rq(cpu);

+ p->sched_remote_wakeup = !!(wake_flags & WF_MIGRATED);
if (llist_add(&p->wake_entry, &cpu_rq(cpu)->wake_list)) {
if (!set_nr_if_polling(rq->idle))
@@ -1869,7 +1873,7 @@ static void ttwu_queue(struct task_struct *p, int cpu, int wake_flags)
#if defined(CONFIG_SMP)
if (sched_feat(TTWU_QUEUE) && !cpus_share_cache(smp_processor_id(), cpu)) {
sched_clock_cpu(cpu); /* sync clocks x-cpu */
- ttwu_queue_remote(p, cpu);
+ ttwu_queue_remote(p, cpu, wake_flags);