Re: [PATCH] sched/fair: vruntime should normalize when switching from fair

From: Dietmar Eggemann
Date: Wed Aug 29 2018 - 11:33:38 EST

On 08/29/2018 12:59 PM, Peter Zijlstra wrote:
> On Wed, Aug 29, 2018 at 11:54:58AM +0100, Dietmar Eggemann wrote:
>> I forgot to mention that since fair_task's cpu affinity is restricted to
>> CPU4, there is no call to set_task_cpu()->migrate_task_rq_fair() since if
>> (task_cpu(p) != cpu) fails.
>>
>> I think the combination of cpu affinity of the fair_task to CPU4 and the
>> fact that the scheduler runs on CPU1 when waking fair_task (with the two
>> cpus not sharing LLC) while TTWU_QUEUE is enabled is the situation in which
>> this vruntime issue can happen.
>
> Ohhh, D'0h. A remote wakeup that doesn't migrate.

Ah, there is this WF_MIGRATED flag, perfect for distinguishing whether a
task has migrated or not.
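
To make the distinction concrete, here is a rough user-space sketch of how
I understand the flag to be recorded on the wakeup path. It is a simplified
model and not the kernel code: struct task, wake_task() and the
vruntime_normalized field are made up for illustration, and the value of
WF_MIGRATED is only a placeholder. Only the names WF_MIGRATED and
sched_remote_wakeup are taken from the kernel.

```c
#include <stdbool.h>

#define WF_MIGRATED 0x4	/* placeholder value for this model */

/* Hypothetical, stripped-down stand-in for struct task_struct. */
struct task {
	int cpu;			/* CPU the task last ran on */
	bool sched_remote_wakeup;	/* set only if the wakeup migrated the task */
	bool vruntime_normalized;	/* model-only: did the migration hook run? */
};

/*
 * Model of the wakeup path: the migration hook, which is what
 * normalizes vruntime, runs only when the target CPU differs from
 * the CPU the task is currently assigned to.
 */
static int wake_task(struct task *p, int target_cpu)
{
	int wake_flags = 0;

	if (p->cpu != target_cpu) {
		wake_flags |= WF_MIGRATED;
		/* stands in for set_task_cpu()->migrate_task_rq_fair() */
		p->vruntime_normalized = true;
		p->cpu = target_cpu;
	}

	/* Remember whether this wakeup involved a migration. */
	p->sched_remote_wakeup = !!(wake_flags & WF_MIGRATED);
	return wake_flags;
}
```

In this model a task pinned to CPU4 and woken for CPU4 keeps
sched_remote_wakeup == 0 and its vruntime stays un-normalized, no matter
which CPU issued the wakeup.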

> That would suggest something like so:
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index b39fb596f6c1..b3b62cf37fb6 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -9638,7 +9638,8 @@ static inline bool vruntime_normalized(struct task_struct *p)
>  	 * - A task which has been woken up by try_to_wake_up() and
>  	 *   waiting for actually being woken up by sched_ttwu_pending().
>  	 */
> -	if (!se->sum_exec_runtime || p->state == TASK_WAKING)
> +	if (!se->sum_exec_runtime ||
> +	    (p->state == TASK_WAKING && p->sched_remote_wakeup))
>  		return true;
>  
>  	return false;

Yes, this solves the issue for the case I described. Using
'p->sched_remote_wakeup' (WF_MIGRATED) looks more elegant than using
'p->sched_class == &fair_sched_class'.
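
For anyone who wants to compare the two conditions side by side, a small
user-space sketch follows. It is a model under stated assumptions, not the
kernel function: struct task and the _old/_new function names are
hypothetical, the entity fields are flattened into one struct, and the
value of TASK_WAKING is a placeholder. The two predicates mirror the
removed and added lines of the diff.

```c
#include <stdbool.h>

#define TASK_WAKING 0x0200	/* placeholder value for this model */

/* Hypothetical stand-in holding only the fields the check reads. */
struct task {
	unsigned long long sum_exec_runtime;
	long state;
	bool sched_remote_wakeup;
};

/* Condition before the change: every TASK_WAKING task counts as normalized. */
static bool vruntime_normalized_old(const struct task *p)
{
	return !p->sum_exec_runtime || p->state == TASK_WAKING;
}

/* Condition with the change: TASK_WAKING counts only if the wakeup migrated. */
static bool vruntime_normalized_new(const struct task *p)
{
	return !p->sum_exec_runtime ||
	       (p->state == TASK_WAKING && p->sched_remote_wakeup);
}
```

For a task that has already run, is in TASK_WAKING and has
sched_remote_wakeup == 0 (the pinned, non-migrating remote wakeup), the
old condition reports the vruntime as normalized while the new one does
not. For a freshly forked task or a wakeup that did migrate, both
conditions agree.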