Another SCHED_DEADLINE bug (with bisection and possible fix)
From: luca abeni
Date: Mon Dec 29 2014 - 18:27:57 EST
Hi all,
when running some experiments on current git master, I noticed a
regression respect to version 3.18 of the kernel: when invoking
sched_setattr() to change the SCHED_DEADLINE parameters of a task that
is already scheduled by SCHED_DEADLINE, it is possible to crash the
system.
The bug can be reproduced with this testcase:
http://disi.unitn.it/~abeni/reclaiming/bug-test.tgz
Uncompress it, enter the "Bug-Test" directory, and type "make test".
After few cycles, my test machine (a laptop with an intel i7 CPU)
becomes unusable, and freezes.
Since I know that 3.18 is not affected by this bug, I tried a bisect,
that pointed to commit 67dfa1b756f250972bde31d65e3f8fde6aeddc5b
(sched/deadline: Implement cancel_dl_timer() to use in
switched_from_dl()).
By looking at that commit, I suspect the problem is that it removes the
following lines from init_dl_task_timer():
- if (hrtimer_active(timer)) {
- hrtimer_try_to_cancel(timer);
- return;
- }
As a result, when changing the parameters of a SCHED_DEADLINE task
init_dl_task_timer() is invoked again, and it can initialize a pending
timer (not sure why, but this seems to be the cause of the freezes I am
seeing).
So, I modified core.c::__setparam_dl() to invoke init_dl_task_timer()
only if the task is not already scheduled by SCHED_DEADLINE...
Basically, I changed
init_dl_task_timer(dl_se);
into
if (p->sched_class != &dl_sched_class) {
init_dl_task_timer(dl_se);
}
I am not sure if this is the correct fix, but with this change the
kernel survives my test script (mentioned above), and arrives to 500
cycles (without my patch, it crashes after 2 or 3 cycles).
What do you think? Is my patch correct, or should I fix the issue in a
different way?
Thanks,
Luca
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/