[tip:sched/core] sched/deadline: Fix a bug in dl_overflow()

From: tip-bot for Xunlei Pang
Date: Sat Apr 23 2016 - 09:00:44 EST

Commit-ID: fec148c000d0f9ac21679601722811eb60b4cc52
Gitweb: http://git.kernel.org/tip/fec148c000d0f9ac21679601722811eb60b4cc52
Author: Xunlei Pang <xlpang@xxxxxxxxxx>
AuthorDate: Thu, 14 Apr 2016 20:19:28 +0800
Committer: Ingo Molnar <mingo@xxxxxxxxxx>
CommitDate: Sat, 23 Apr 2016 14:20:43 +0200

sched/deadline: Fix a bug in dl_overflow()

I got a minus(very big) dl_b->total_bw during my deadline tests.

# grep dl /proc/sched_debug
.dl_nr_running : 0
.dl_bw->bw : 996147
.dl_bw->total_bw : -222297900

Something unusual must have happened.

After some digging, I finally noticed that when changing a deadline
task to normal(cfs), and changing it back to deadline immediately,
after it died, we will got the wrong dl_bw->total_bw.

The root cause is in dl_overflow(), it has:
if (new_bw == p->dl.dl_bw)
return 0;

1) When a deadline task is changed to !deadline task, it will start
dl timer in switched_from_dl(), and retain previous deadline parameter
till the timer expires.

2) If we change it back to deadline with the same bandwidth parameter
before the timer expires, as it keeps the old bandwidth although it
is not a deadline task. dl_overflow() simply returns success without
updating the right data, and got the wrong dl_bw->total_bw.

The solution is simple, if @p is not deadline, don't return.

Signed-off-by: Xunlei Pang <xlpang@xxxxxxxxxx>
Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
Acked-by: Juri Lelli <juri.lelli@xxxxxxx>
Cc: Mike Galbraith <efault@xxxxxx>
Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Cc: Steven Rostedt <rostedt@xxxxxxxxxxx>
Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
Link: http://lkml.kernel.org/r/1460636368-1993-1-git-send-email-xlpang@xxxxxxxxxx
Signed-off-by: Ingo Molnar <mingo@xxxxxxxxxx>
kernel/sched/core.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 71dffbb..9d84d60 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2378,7 +2378,8 @@ static int dl_overflow(struct task_struct *p, int policy,
u64 new_bw = dl_policy(policy) ? to_ratio(period, runtime) : 0;
int cpus, err = -1;

- if (new_bw == p->dl.dl_bw)
+ /* !deadline task may carry old deadline bandwidth */
+ if (new_bw == p->dl.dl_bw && task_has_dl_policy(p))
return 0;