Re: [RFC PATCH 2/3] docs: scheduler: Add scheduler overview documentation

From: Valentin Schneider
Date: Wed Apr 01 2020 - 11:03:42 EST



On 01/04/20 11:00, John Mathew wrote:
> +**Schedule class:** It is an extensible hierarchy of scheduler modules. The
> +modules encapsulate scheduling policy details.
> +They are called from the core code which is independent. Scheduling classes are
> +implemented through the sched_class structure.
> +fair_sched_class and rt_sched_class class are implementations of this class. The
> +main members of the :c:type:`struct sched_class <sched_class>` are :
> +
> +For the fair_sched_class the hooks (implemented as <function name>_fair)
> +does the following:
> +
> +:c:member:`enqueue_task`
> + Update the fair scheduling stats and puts scheduling entity in
> + to rb tree and increments the nr_running variable.
> +
> +:c:member:`dequeue_task`
> + Moves the entity out of the rb tree when entity no longer runnable
> + and decrements the nr_running variable. Also update the fair scheduling stats.
> +
> +:c:member:`yield_task`
> + Use the buddy mechanism to skip onto the next highest priority se at
> + every level in the CFS tree, unless doing so would introduce gross unfairness
> + in CPU time distribution.
> +
> +:c:member:`check_preempt_curr`
> + Check whether the task that woke up should pre-empt the
> + running task.
> +
> +:c:member:`pick_next_task`
> + Pick the next eligible task. This may not be the left most task
> + in the rbtree. Instead a buddy system is used which provides benefits of
> + cache locality and group scheduling.
> +
> +:c:member:`task_tick`
> + Called from scheduler_tick(). Updates the runtime statistics of the
> + currently running task and checks if this task needs to be pre-empted.
> +
> +:c:member:`task_fork`
> + scheduler setup for newly forked task.
> +
> +:c:member:`task_dead`
> + A task struct has one reference for the use as "current". If a task
> + dies, then it sets TASK_DEAD in tsk->state and calls schedule one last time.
> + The schedule call will never return, and the scheduled task must drop that
> + reference.
> +

I tend to agree with Matthew that this is too much information about the
current implementation. What would be useful, however, is some sort of
documentation for the sched_class fields themselves; as you say, those are
(mainly) called from core code, so IMO what's interesting is when and why
the core code calls them.

For instance, highlighting the "change" cycle would be a good start; see
e.g. do_set_cpus_allowed() and what it does with {en,de}queue_task() and
{set_next,put_prev}_task().
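
To make the "change" cycle concrete, here is a rough userspace sketch of
the pattern I have in mind. It is not kernel code: every toy_* name, the
trace buffer and the simplified hook signatures (no rq, no flags) are
made up for illustration, and the real do_set_cpus_allowed() also deals
with locking and enqueue/dequeue flags that are left out here. The point
is only the ordering: take the task out, change the attribute, put it
back.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

struct toy_task;

/* Hypothetical, cut-down stand-in for struct sched_class. */
struct toy_class {
	void (*enqueue_task)(struct toy_task *p);
	void (*dequeue_task)(struct toy_task *p);
	void (*put_prev_task)(struct toy_task *p);
	void (*set_next_task)(struct toy_task *p);
};

/* Hypothetical, cut-down stand-in for struct task_struct. */
struct toy_task {
	const struct toy_class *class;
	bool queued;			/* task is on the runqueue */
	bool running;			/* task is the one currently selected */
	unsigned long cpus_mask;	/* the attribute being changed */
};

/* Records the order in which the hooks fire. */
static char trace[160];
static size_t trace_len;

static void log_step(const char *s)
{
	int n = snprintf(trace + trace_len, sizeof(trace) - trace_len,
			 "%s%s", trace_len ? " " : "", s);

	if (n > 0 && (size_t)n < sizeof(trace) - trace_len)
		trace_len += (size_t)n;
}

static void toy_enqueue(struct toy_task *p)  { p->queued = true;  log_step("enqueue_task"); }
static void toy_dequeue(struct toy_task *p)  { p->queued = false; log_step("dequeue_task"); }
static void toy_put_prev(struct toy_task *p) { (void)p; log_step("put_prev_task"); }
static void toy_set_next(struct toy_task *p) { (void)p; log_step("set_next_task"); }

static const struct toy_class toy_class = {
	.enqueue_task	= toy_enqueue,
	.dequeue_task	= toy_dequeue,
	.put_prev_task	= toy_put_prev,
	.set_next_task	= toy_set_next,
};

/* The "change" cycle: dequeue/put_prev, modify, enqueue/set_next. */
void toy_set_cpus_allowed(struct toy_task *p, unsigned long new_mask)
{
	bool queued, running;

	assert(p && p->class);

	/* Sample the state first; the hooks below may alter it. */
	queued = p->queued;
	running = p->running;

	if (queued)
		p->class->dequeue_task(p);
	if (running)
		p->class->put_prev_task(p);

	p->cpus_mask = new_mask;
	log_step("change");

	if (queued)
		p->class->enqueue_task(p);
	if (running)
		p->class->set_next_task(p);
}

/* Runs the cycle on a task that is both queued and running. */
const char *toy_change_demo(void)
{
	struct toy_task p = {
		.class = &toy_class,
		.queued = true,
		.running = true,
		.cpus_mask = 0x1,
	};

	memset(trace, 0, sizeof(trace));
	trace_len = 0;
	toy_set_cpus_allowed(&p, 0x3);
	return trace;
}
```

Calling toy_change_demo() from a small main() and printing the result
shows the hooks firing around the attribute update. A task that is
neither queued nor running gets the update with no hook calls at all,
which is the kind of "when/why" detail the documentation could spell out
per field.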