[PATCH 10/10] sched/deadline: documentation about GRUB reclaiming

From: luca abeni
Date: Thu May 18 2017 - 16:16:00 EST


From: Claudio Scordino <claudio@xxxxxxxxxxxxxxx>

This patch adds the documentation about the GRUB reclaiming algorithm,
adding a few details discussed in list.

Signed-off-by: Claudio Scordino <claudio@xxxxxxxxxxxxxxx>
Signed-off-by: Luca Abeni <luca.abeni@xxxxxxxxxxxxxxx>
---
Documentation/scheduler/sched-deadline.txt | 168 +++++++++++++++++++++++++++++
1 file changed, 168 insertions(+)

diff --git a/Documentation/scheduler/sched-deadline.txt b/Documentation/scheduler/sched-deadline.txt
index cbc1b46..e89e36e 100644
--- a/Documentation/scheduler/sched-deadline.txt
+++ b/Documentation/scheduler/sched-deadline.txt
@@ -7,6 +7,8 @@ CONTENTS
0. WARNING
1. Overview
2. Scheduling algorithm
+ 2.1 Main algorithm
+ 2.2 Bandwidth reclaiming
3. Scheduling Real-Time Tasks
3.1 Definitions
3.2 Schedulability Analysis for Uniprocessor Systems
@@ -44,6 +46,9 @@ CONTENTS
2. Scheduling algorithm
==================

+2.1 Main algorithm
+------------------
+
SCHED_DEADLINE uses three parameters, named "runtime", "period", and
"deadline", to schedule tasks. A SCHED_DEADLINE task should receive
"runtime" microseconds of execution time every "period" microseconds, and
@@ -113,6 +118,160 @@ CONTENTS
remaining runtime = remaining runtime + runtime


+2.2 Bandwidth reclaiming
+------------------------
+
+ Bandwidth reclaiming for deadline tasks is based on the GRUB (Greedy
+ Reclamation of Unused Bandwidth) algorithm [15, 16, 17] and it is enabled
+ when flag SCHED_FLAG_RECLAIM is set.
+
+ The following diagram illustrates the state names for tasks handled by GRUB:
+
+ ------------
+ (d) | Active |
+ ------------->| |
+ | | Contending |
+ | ------------
+ | A |
+ ---------- | |
+ | | | |
+ | Inactive | |(b) | (a)
+ | | | |
+ ---------- | |
+ A | V
+ | ------------
+ | | Active |
+ --------------| Non |
+ (c) | Contending |
+ ------------
+
+ A task can be in one of the following states:
+
+ - ActiveContending: if it is ready for execution (or executing);
+
+ - ActiveNonContending: if it just blocked and has not yet surpassed the 0-lag
+ time;
+
+ - Inactive: if it is blocked and has surpassed the 0-lag time.
+
+ State transitions:
+
+ (a) When a task blocks, it does not become immediately inactive since its
+ bandwidth cannot be immediately reclaimed without breaking the
+ real-time guarantees. It therefore enters a transitional state called
+ ActiveNonContending. The scheduler arms the "inactive timer" to fire at
+ the 0-lag time, when the task's bandwidth can be reclaimed without
+ breaking the real-time guarantees.
+
+ The 0-lag time for a task entering the ActiveNonContending state is
+ computed as
+
+ (runtime * dl_period)
+ deadline - ---------------------
+ dl_runtime
+
+ where runtime is the remaining runtime, while dl_runtime and dl_period
+ are the reservation parameters.
+
+ (b) If the task wakes up before the inactive timer fires, the task re-enters
+ the ActiveContending state and the "inactive timer" is canceled.
+ In addition, if the task wakes up on a different runqueue, then
+ the task's utilization must be removed from the previous runqueue's active
+ utilization and must be added to the new runqueue's active utilization.
+ In order to avoid races between a task waking up on a runqueue while the
+ "inactive timer" is running on a different CPU, the "dl_non_contending"
+ flag is used to indicate that a task is not on a runqueue but is active
+ (so, the flag is set when the task blocks and is cleared when the
+ "inactive timer" fires or when the task wakes up).
+
+ (c) When the "inactive timer" fires, the task enters the Inactive state and
+ its utilization is removed from the runqueue's active utilization.
+
+ (d) When an inactive task wakes up, it enters the ActiveContending state and
+ its utilization is added to the active utilization of the runqueue where
+ it has been enqueued.
+
+ For each runqueue, the algorithm GRUB keeps track of two different bandwidths:
+
+ - Active bandwidth (running_bw): this is the sum of the bandwidths of all
+ tasks in active state (i.e., ActiveContending or ActiveNonContending);
+
+ - Total bandwidth (this_bw): this is the sum of all tasks "belonging" to the
+ runqueue, including the tasks in Inactive state.
+
+
+ The algorithm reclaims the bandwidth of the tasks in Inactive state.
+ It does so by decrementing the runtime of the executing task Ti at a pace equal
+ to
+
+ dq = -max{ Ui, (1 - Uinact) } dt
+
+ where Uinact is the inactive utilization, computed as (this_bq - running_bw),
+ and Ui is the bandwidth of task Ti.
+
+
+ Let's now see a trivial example of two deadline tasks with runtime equal
+ to 4 and period equal to 8 (i.e., bandwidth equal to 0.5):
+
+ A Task T1
+ |
+ | |
+ | |
+ |-------- |----
+ | | V
+ |---|---|---|---|---|---|---|---|--------->t
+ 0 1 2 3 4 5 6 7 8
+
+
+ A Task T2
+ |
+ | |
+ | |
+ | ------------------------|
+ | | V
+ |---|---|---|---|---|---|---|---|--------->t
+ 0 1 2 3 4 5 6 7 8
+
+
+ A running_bw
+ |
+ 1 ----------------- ------
+ | | |
+ 0.5- -----------------
+ | |
+ |---|---|---|---|---|---|---|---|--------->t
+ 0 1 2 3 4 5 6 7 8
+
+
+ - Time t = 0:
+
+ Both tasks are ready for execution and therefore in ActiveContending state.
+ Suppose Task T1 is the first task to start execution.
+ Since there are no inactive tasks, its runtime is decreased as dq = -1 dt.
+
+ - Time t = 2:
+
+ Suppose that task T1 blocks
+ Task T1 therefore enters the ActiveNonContending state. Since its remaining
+ runtime is equal to 2, its 0-lag time is equal to t = 4.
+ Task T2 start execution, with runtime still decreased as dq = -1 dt since
+ there are no inactive tasks.
+
+ - Time t = 4:
+
+ This is the 0-lag time for Task T1. Since it didn't woken up in the
+ meantime, it enters the Inactive state. Its bandwidth is removed from
+ running_bw.
+ Task T2 continues its execution. However, its runtime is now decreased as
+ dq = - 0.5 dt because Uinact = 0.5.
+ Task T2 therefore reclaims the bandwidth unused by Task T1.
+
+ - Time t = 8:
+
+ Task T1 wakes up. It enters the ActiveContending state again, and the
+ running_bw is incremented.
+
+
3. Scheduling Real-Time Tasks
=============================

@@ -330,6 +489,15 @@ CONTENTS
14 - J. Erickson, U. Devi and S. Baruah. Improved tardiness bounds for
Global EDF. Proceedings of the 22nd Euromicro Conference on
Real-Time Systems, 2010.
+ 15 - G. Lipari, S. Baruah, Greedy reclamation of unused bandwidth in
+ constant-bandwidth servers, 12th IEEE Euromicro Conference on Real-Time
+ Systems, 2000.
+ 16 - L. Abeni, J. Lelli, C. Scordino, L. Palopoli, Greedy CPU reclaiming for
+ SCHED DEADLINE. In Proceedings of the Real-Time Linux Workshop (RTLWS),
+ Dusseldorf, Germany, 2014.
+ 17 - L. Abeni, G. Lipari, A. Parri, Y. Sun, Multicore CPU reclaiming: parallel
+ or sequential?. In Proceedings of the 31st Annual ACM Symposium on Applied
+ Computing, 2016.


4. Bandwidth management
--
2.7.4