[PATCH] sched: Fix SCHED_HRTICK bug leading to late preemption of tasks

From: Joonwoo Park
Date: Fri Sep 16 2016 - 21:29:14 EST


From: Srivatsa Vaddagiri <vatsa@xxxxxxxxxxxxxx>

The SCHED_HRTICK feature is useful for preempting SCHED_FAIR tasks
on-the-dot (just when they would have exceeded their ideal_runtime). It
makes use of a per-cpu hrtimer resource, and hence arming that hrtimer
should be based on the total number of SCHED_FAIR tasks a cpu has across
its various cfs_rqs, rather than on the number of tasks in a particular
cfs_rq (as implemented currently). As a result, with the current code,
it's possible for a running task (which is the sole task in its cfs_rq)
to be preempted long after its ideal_runtime has elapsed, resulting in
increased latency for tasks in other cfs_rqs on the same cpu.

Fix this by arming the sched hrtimer based on the total number of
SCHED_FAIR tasks a CPU has across its various cfs_rqs.

Cc: Ingo Molnar <mingo@xxxxxxxxxx>
Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Cc: linux-kernel@xxxxxxxxxxxxxxx
Signed-off-by: Srivatsa Vaddagiri <vatsa@xxxxxxxxxxxxxx>
Signed-off-by: Joonwoo Park <joonwoop@xxxxxxxxxxxxxx>
---

joonwoop: Do we also need to update or remove the if-statement inside
hrtick_update()?
I guess not, because hrtick_update() doesn't want to start hrtick when the
cfs_rq has a large nr_running, where the slice is longer than sched_latency.
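
For illustration only, here is a userspace sketch of the case described in
the changelog: two task groups on one CPU, each with a single runnable
task. The toy_* structures and the hrtick_needed_*() helpers are
hypothetical stand-ins, not kernel code; they keep only the two counters
the check reads and assume h_nr_running counts tasks across the whole
hierarchy below a cfs_rq.

```c
#include <stdbool.h>

/* Toy stand-in for struct cfs_rq; only the fields used by the check. */
struct toy_cfs_rq {
	unsigned int nr_running;   /* entities queued directly on this cfs_rq */
	unsigned int h_nr_running; /* tasks on this cfs_rq and all descendants */
};

/* Toy stand-in for struct rq; cfs is the CPU's root cfs_rq. */
struct toy_rq {
	struct toy_cfs_rq cfs;
};

/* Check before the patch: looks only at the current task's own cfs_rq. */
static bool hrtick_needed_old(const struct toy_cfs_rq *cfs_rq)
{
	return cfs_rq->nr_running > 1;
}

/* Check after the patch: counts every fair task on the CPU. */
static bool hrtick_needed_new(const struct toy_rq *rq)
{
	return rq->cfs.h_nr_running > 1;
}
```

With a group cfs_rq of { .nr_running = 1, .h_nr_running = 1 } and a root
cfs_rq of { .nr_running = 2, .h_nr_running = 2 }, hrtick_needed_old() on
the group returns false, so the hrtimer is never armed for the sole task
in that group, while hrtick_needed_new() on the rq returns true.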

kernel/sched/fair.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 4088eed..c55c566 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -4458,7 +4458,7 @@ static void hrtick_start_fair(struct rq *rq, struct task_struct *p)

 	WARN_ON(task_rq(p) != rq);
 
-	if (cfs_rq->nr_running > 1) {
+	if (rq->cfs.h_nr_running > 1) {
 		u64 slice = sched_slice(cfs_rq, se);
 		u64 ran = se->sum_exec_runtime - se->prev_sum_exec_runtime;
 		s64 delta = slice - ran;
--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
hosted by The Linux Foundation