Re: [PATCH] sched: Optimize pick_next_task for idle_sched_class too

From: Peter Zijlstra
Date: Wed Mar 01 2017 - 11:45:04 EST


On Wed, Mar 01, 2017 at 10:53:03AM -0500, Steven Rostedt wrote:
> Peter, do we have a solution for this yet? Are you going to add the one
> with the linker magic?

I queued the below earlier today.

---
Subject: sched: Fix pick_next_task() for RT,DL
From: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Date: Wed Mar 1 10:51:47 CET 2017

Pavan noticed that commit 49ee576809d8 ("sched/core: Optimize
pick_next_task() for idle_sched_class") broke RT,DL balancing by
robbing them of the opportunity to do new-'idle' balancing when their
last runnable task (on that runqueue) goes away.

Cc: Steven Rostedt <rostedt@xxxxxxxxxxx>
Reported-by: Pavan Kondeti <pkondeti@xxxxxxxxxxxxxx>
Fixes: 49ee576809d8 ("sched/core: Optimize pick_next_task() for idle_sched_class")
Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
---
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -3273,10 +3273,15 @@ pick_next_task(struct rq *rq, struct tas
struct task_struct *p;

/*
- * Optimization: we know that if all tasks are in
- * the fair class we can call that function directly:
+ * Optimization: we know that if all tasks are in the fair class we can
+ * call that function directly, but only if the @prev task wasn't of a
+ * higher scheduling class, because otherwise those lose the
+ * opportunity to pull in more work from other CPUs.
*/
- if (likely(rq->nr_running == rq->cfs.h_nr_running)) {
+ if (likely((prev->sched_class == &idle_sched_class ||
+ prev->sched_class == &fair_sched_class) &&
+ rq->nr_running == rq->cfs.h_nr_running)) {
+
p = fair_sched_class.pick_next_task(rq, prev, rf);
if (unlikely(p == RETRY_TASK))
goto again;
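
For readers following along without a kernel tree, the new condition can be
modelled in user space. This is only a rough sketch: toy_class, toy_rq and
toy_fair_fast_path() are hypothetical names invented for illustration, not
kernel identifiers, and the model keeps just the two counters the check
reads. It shows that the fair fast path is taken only when prev was idle or
fair and every runnable task is in the fair class.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's scheduling classes. */
enum toy_class { TOY_IDLE, TOY_FAIR, TOY_RT, TOY_DL };

/* Hypothetical, reduced runqueue: only the counters the check compares. */
struct toy_rq {
	unsigned int nr_running;       /* all runnable tasks on this runqueue */
	unsigned int cfs_h_nr_running; /* runnable tasks in the fair class */
};

/*
 * Returns true when the fair-class fast path may be used, mirroring the
 * condition in the patch: prev must be idle or fair, and all runnable
 * tasks must belong to the fair class.
 */
static bool toy_fair_fast_path(enum toy_class prev_class,
			       const struct toy_rq *rq)
{
	bool prev_ok = prev_class == TOY_IDLE || prev_class == TOY_FAIR;

	return prev_ok && rq->nr_running == rq->cfs_h_nr_running;
}
```

With this model, an RT or DL prev on a runqueue that has just become empty
(both counters zero) no longer takes the fast path, so in the real code the
class iteration gets a chance to run and pull work from other CPUs.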