[RFC 0/8] CPU reclaiming for SCHED_DEADLINE

From: Luca Abeni
Date: Thu Jan 14 2016 - 10:41:09 EST


Hi all,

this patchset implements CPU reclaiming (using the GRUB algorithm[1])
for SCHED_DEADLINE: basically, this feature allows SCHED_DEADLINE tasks
to consume more than their reserved runtime, up to a maximum fraction
of the CPU time (so that other tasks are left some spare CPU time to
execute), if this does not break the guarantees of other SCHED_DEADLINE
tasks.

I am sending this as an RFC because I think the code still needs some work
and/or cleanups (or maybe the patches should be split or merged in a
different way), but I'd like to check if there is interest in merging this
feature and if the current implementation strategy is reasonable.

I have cc'd the usual people interested in SCHED_DEADLINE patches; if
you think that I should have added someone else, let me know (or please
forward these patches to interested people).

The implemented CPU reclaiming algorithm is based on tracking the
utilization U_act of active tasks (first 5 patches), and modifying the
runtime accounting rule (see patch 0006). The original GRUB algorithm is
modified as described in [2] to support multiple CPUs (the original
algorithm only considered a single CPU, while this one tracks U_act per
runqueue) and to leave an "unreclaimable" fraction of CPU time to
non-SCHED_DEADLINE tasks (the original algorithm can consume 100% of the
CPU time, starving all the other tasks).
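
To make the modified accounting rule more concrete, here is a minimal
userspace sketch (it is not the code from patch 0006). It assumes the basic
GRUB rule from [1], dq = -U_act dt, and shows one possible way of leaving an
unreclaimable fraction by scaling with a hypothetical U_max (the maximum
fraction of CPU time usable by SCHED_DEADLINE tasks). The fixed-point format
and all the names are assumptions made for illustration only.

```c
#include <stdint.h>

#define BW_SHIFT 20                     /* assumed fixed-point format */
#define BW_UNIT  (1ULL << BW_SHIFT)     /* represents a utilization of 1.0 */

/*
 * Without reclaiming, a task executing for delta_ns consumes delta_ns of
 * runtime (dq = -dt). With GRUB-style accounting the runtime is depleted
 * more slowly when the active utilization is low (dq = -U_act dt).
 * delta_ns * u_act is assumed not to overflow 64 bits.
 */
static uint64_t grub_charge(uint64_t delta_ns, uint64_t u_act)
{
	return (delta_ns * u_act) >> BW_SHIFT;
}

/*
 * Variant that leaves (1 - U_max) of the CPU time unreclaimable:
 * dq = -(U_act / U_max) dt.
 * Assumes 0 < u_max <= BW_UNIT and u_act <= u_max.
 */
static uint64_t grub_charge_capped(uint64_t delta_ns, uint64_t u_act,
				   uint64_t u_max)
{
	return delta_ns * u_act / u_max;
}
```

For example, with U_act = 0.5 a task running for 1ms would be charged only
0.5ms by grub_charge(), so it can run for twice its reserved runtime before
being throttled; with U_act = U_max the capped variant charges the full 1ms.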

I tried to split the patches so that the whole patchset can be better
understood; if they should be organized in a different way, let me know.
The first 5 patches (tracking of per-runqueue active utilization) can
be useful for frequency scaling too (the tracked "active utilization"
gives a clear hint about how much the core speed can be reduced without
compromising the SCHED_DEADLINE guarantees):
- patches 0001 and 0002 implement a simple tracking of the active
  utilization that is too optimistic from the theoretical point of
  view
- patch 0003 is mainly useful for debugging this patchset and can
  be removed without problems
- patch 0004 implements the "active utilization" tracking algorithm
  described in [1,2]. It uses a timer (named "inactive timer" here) to
  decrease U_act at the correct time (I called it the "0-lag time").
  I am working on an alternative implementation that does not use
  additional timers, but it is not ready yet; I'll post it when ready
  and tested
- patch 0005 tracks the utilization of the tasks that can execute on
  each runqueue. It is a pessimistic approximation of U_act (so, if
  used instead of U_act, it allows less CPU time to be reclaimed, but
  does not break SCHED_DEADLINE guarantees)
- patches 0006-0008 implement the reclaiming algorithm.
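
The tracking with an "inactive timer" can be sketched as follows. This is a
simplified userspace model, not the code from patch 0004: the structures and
function names are hypothetical, and the 0-lag time is assumed to be
d - q * P / Q (current absolute deadline minus remaining runtime divided by
the reserved bandwidth Q / P).

```c
#include <stdbool.h>
#include <stdint.h>

#define BW_SHIFT 20	/* assumed fixed-point format for utilizations */

struct dl_task {		/* hypothetical, simplified task descriptor */
	uint64_t dl_runtime;	/* reserved runtime Q (ns) */
	uint64_t dl_period;	/* reservation period P (ns) */
	uint64_t runtime;	/* remaining runtime q (ns) */
	uint64_t deadline;	/* current absolute deadline d (ns) */
	bool contributes;	/* is this task's bandwidth counted in U_act? */
};

struct dl_rq {
	uint64_t u_act;		/* active utilization of this runqueue */
};

static uint64_t task_bw(const struct dl_task *t)
{
	return (t->dl_runtime << BW_SHIFT) / t->dl_period;
}

/* Assumes the products fit in 64 bits and q * P / Q <= d. */
static uint64_t zero_lag_time(const struct dl_task *t)
{
	return t->deadline - t->runtime * t->dl_period / t->dl_runtime;
}

/* On wakeup the task becomes active (a pending timer would be cancelled). */
static void task_wakeup(struct dl_rq *rq, struct dl_task *t)
{
	if (!t->contributes) {
		rq->u_act += task_bw(t);
		t->contributes = true;
	}
}

/*
 * On blocking, U_act can be decreased immediately only if the 0-lag time
 * has already passed. Otherwise return true: the caller should arm the
 * inactive timer so that it fires at zero_lag_time(t).
 */
static bool task_block(struct dl_rq *rq, struct dl_task *t, uint64_t now)
{
	if (zero_lag_time(t) <= now) {
		rq->u_act -= task_bw(t);
		t->contributes = false;
		return false;
	}
	return true;
}

/* Inactive timer handler: the 0-lag time has been reached. */
static void inactive_timer_fired(struct dl_rq *rq, struct dl_task *t)
{
	if (t->contributes) {
		rq->u_act -= task_bw(t);
		t->contributes = false;
	}
}
```

In this model, removing a task's bandwidth as soon as it blocks (without
waiting for the 0-lag time) would correspond to the "too optimistic" simple
tracking, while never removing it would correspond to the pessimistic
per-runqueue total utilization.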

[1] http://retis.sssup.it/~lipari/papers/lipariBaruah2000.pdf
[2] http://disi.unitn.it/~abeni/reclaiming/rtlws14-grub.pdf



Juri Lelli (1):
  sched/deadline: add some tracepoints

Luca Abeni (7):
  Track the active utilisation
  Correctly track the active utilisation for migrating tasks
  Improve the tracking of active utilisation
  Track the "total rq utilisation" too
  GRUB accounting
  Make GRUB a task's flag
  Do not reclaim the whole CPU bandwidth
 include/linux/sched.h        |   1 +
 include/trace/events/sched.h |  69 ++++++++++++++
 include/uapi/linux/sched.h   |   1 +
 kernel/sched/core.c          |   3 +-
 kernel/sched/deadline.c      | 214 +++++++++++++++++++++++++++++++++++++++++--
 kernel/sched/sched.h         |  12 +++
 6 files changed, 292 insertions(+), 8 deletions(-)

--
1.9.1