Re: [PATCH 1/1] docs: scheduler: Start documenting the EEVDF scheduler

From: Carlos Bilbao
Date: Sat Jul 13 2024 - 10:00:57 EST


Hello,

On 7/12/24 23:22, Randy Dunlap wrote:
> Hi,
>
> On 7/12/24 5:32 PM, Carlos Bilbao wrote:
>> Add some documentation regarding the newly introduced scheduler EEVDF.
>>
>> Signed-off-by: Carlos Bilbao <carlos.bilbao.osdev@xxxxxxxxx>
>> ---
>> Documentation/scheduler/index.rst | 1 +
>> Documentation/scheduler/sched-design-CFS.rst | 10 +++--
>> Documentation/scheduler/sched-eevdf.rst | 44 ++++++++++++++++++++
>> 3 files changed, 51 insertions(+), 4 deletions(-)
>> create mode 100644 Documentation/scheduler/sched-eevdf.rst
>>
>> diff --git a/Documentation/scheduler/index.rst b/Documentation/scheduler/index.rst
>> index 43bd8a145b7a..444a6fef1464 100644
>> --- a/Documentation/scheduler/index.rst
>> +++ b/Documentation/scheduler/index.rst
>> @@ -11,6 +11,7 @@ Scheduler
>> sched-arch
>> sched-bwc
>> sched-deadline
>> + sched-eevdf
>
> I would have probably put EEVDF just after CFS instead of before it...
> whatever.
>
>> sched-design-CFS
>> sched-domains
>> sched-capacity
>> diff --git a/Documentation/scheduler/sched-design-CFS.rst b/Documentation/scheduler/sched-design-CFS.rst
>> index bc1e507269c6..b703c6dcb3cd 100644
>> --- a/Documentation/scheduler/sched-design-CFS.rst
>> +++ b/Documentation/scheduler/sched-design-CFS.rst
>> @@ -8,10 +8,12 @@ CFS Scheduler
>> 1. OVERVIEW
>> ============
>>
>> -CFS stands for "Completely Fair Scheduler," and is the new "desktop" process
>> -scheduler implemented by Ingo Molnar and merged in Linux 2.6.23. It is the
>> -replacement for the previous vanilla scheduler's SCHED_OTHER interactivity
>> -code.
>> +CFS stands for "Completely Fair Scheduler," and is the "desktop" process
>> +scheduler implemented by Ingo Molnar and merged in Linux 2.6.23. When
>> +originally merged, it was the replacement for the previous vanilla
>> +scheduler's SCHED_OTHER interactivity code. Nowadays, CFS is making room
>> +for EEVDF, for which documentation can be found in
>> +:ref:`sched_design_EEVDF`.
>>
>> 80% of CFS's design can be summed up in a single sentence: CFS basically models
>> an "ideal, precise multi-tasking CPU" on real hardware.
>> diff --git a/Documentation/scheduler/sched-eevdf.rst b/Documentation/scheduler/sched-eevdf.rst
>> new file mode 100644
>> index 000000000000..31ad8f995360
>> --- /dev/null
>> +++ b/Documentation/scheduler/sched-eevdf.rst
>> @@ -0,0 +1,44 @@
>> +.. _sched_design_EEVDF:
>> +
>> +===============
>> +EEVDF Scheduler
>> +===============
>> +
>> +The "Earliest Eligible Virtual Deadline First" (EEVDF) was first introduced
>> +in a scientific publication in 1995 [1]. The Linux kernel began
>> +transitioning to EEVDF in version 6.6 (released in October 2023), moving
>> +away from the earlier Completely Fair Scheduler (CFS) in favor of a version
>> +of EEVDF proposed by Peter Zijlstra in 2023 [2-4]. More information
>> +regarding CFS can be found in :ref:`sched_design_CFS`.
>> +
>> +Similarly to CFS, EEVDF aims to distribute CPU time equally among all
>> +runnable tasks with the same priority. To do so, it assigns a virtual run
>> +time to each task, creating a "lag" value that can be used to determine
>> +whether a task has received its fair share of CPU time. In this way, a task
>> +with a positive lag is owed CPU time, while a negative lag means the task
>> +has exceeded its portion. EEVDF picks tasks with lag greater than or equal to
>> +zero and calculates a virtual deadline (VD) for each, selecting the task
>> +with the earliest VD to execute next. It's important to note that this
>> +allows latency-sensitive tasks with shorter time slices to be prioritized,
>> +which helps with their responsiveness.
>> +
>> +There are ongoing discussions on how to manage lag, especially for sleeping
>> +tasks, but at the time of writing EEVDF uses a "decaying" mechanism based
>> +on virtual run time (VRT). This prevents tasks from exploiting the system
>> +by sleeping briefly to reset their negative lag: when a task sleeps, it
>> +remains on the run queue but is marked for "deferred dequeue," allowing its
>> +lag to decay over VRT. Hence, long-sleeping tasks eventually have their lag
>> +reset. Finally, tasks can preempt others if their VD is earlier, and tasks
>> +can request specific time slices using the sched_setattr() system call,
>> +which further facilitates the job of latency-sensitive applications.
>> +
>> +4. REFERENCES
>> +=============
>
> Why is this section numbered 4?
> No other sections here are numbered.
>
>> +
>> +[1] https://citeseerx.ist.psu.edu/document?repid=rep1&type=pdf&doi=805acf7726282721504c8f00575d91ebfd750564
>> +
>> +[2] https://lore.kernel.org/lkml/a79014e6-ea83-b316-1e12-2ae056bda6fa@xxxxxxxxxxxxxxxxxx/
>> +
>> +[3] https://lwn.net/Articles/969062/
>> +
>> +[4] https://lwn.net/Articles/925371/
>
> Other than those 2 comments:
>
> Reviewed-by: Randy Dunlap <rdunlap@xxxxxxxxxxxxx>
> Tested-by: Randy Dunlap <rdunlap@xxxxxxxxxxxxx>

Thank you for reviewing and providing feedback, Randy. I'm sending v2 now.
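
For anyone following the thread, here is a rough sketch of the pick rule
described in the sched-eevdf.rst text quoted above: measure each task's lag
against the weighted average virtual run time, keep the tasks with lag >= 0,
and run the one with the earliest virtual deadline. This is a toy Python
model, not kernel code. The Task fields, the NICE_0_WEIGHT scaling, and all
helper names are assumptions made up for illustration.

```python
from dataclasses import dataclass

NICE_0_WEIGHT = 1024  # assumed nominal weight used to scale slices


@dataclass
class Task:
    name: str
    weight: int      # hypothetical load weight
    vruntime: int    # virtual run time received so far
    slice_ns: int    # requested time slice


def _totals(runnable):
    # Total weight and weighted sum of virtual run times of the run queue.
    total_weight = sum(t.weight for t in runnable)
    weighted_sum = sum(t.weight * t.vruntime for t in runnable)
    return total_weight, weighted_sum


def lag(task, runnable):
    # Positive lag: the task is owed CPU time. Negative: it ran too much.
    total_weight, weighted_sum = _totals(runnable)
    return weighted_sum / total_weight - task.vruntime


def is_eligible(task, runnable):
    # Same test as lag >= 0, done in integers to avoid rounding issues.
    total_weight, weighted_sum = _totals(runnable)
    return weighted_sum >= task.vruntime * total_weight


def virtual_deadline(task):
    # A shorter slice (or a larger weight) gives an earlier deadline.
    return task.vruntime + task.slice_ns * NICE_0_WEIGHT // task.weight


def pick_next(runnable):
    # Earliest virtual deadline among the eligible tasks.
    eligible = [t for t in runnable if is_eligible(t, runnable)]
    return min(eligible, key=virtual_deadline)


runnable = [
    Task("A", weight=1024, vruntime=100, slice_ns=3000),
    Task("B", weight=1024, vruntime=90, slice_ns=3000),
    Task("C", weight=1024, vruntime=95, slice_ns=500),
]
print(pick_next(runnable).name)  # prints C
```

Here A has negative lag and is skipped; B and C are both eligible, and C wins
because its shorter slice gives it the earlier virtual deadline, which is the
latency-sensitive case the text mentions.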

>
>
> Thanks.
>

Thanks,
Carlos