Re: [PATCH sched_ext/for-7.1] sched_ext: Documentation: Add missing calls to quiescent(), runnable()

From: Andrea Righi

Date: Wed Apr 08 2026 - 09:49:56 EST


On Wed, Apr 08, 2026 at 12:40:09PM +0000, Kuba Piecuch wrote:
> Hi Andrea,
>
> On Wed Apr 8, 2026 at 11:28 AM UTC, Andrea Righi wrote:
> ...
> >
> > Looks good, but I noticed another issue, should we also change the condition up
> > above as following?
> >
> > Documentation/scheduler/sched-ext.rst | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/Documentation/scheduler/sched-ext.rst b/Documentation/scheduler/sched-ext.rst
> > index 29d36e248f58b..99df4cc982375 100644
> > --- a/Documentation/scheduler/sched-ext.rst
> > +++ b/Documentation/scheduler/sched-ext.rst
> > @@ -423,7 +423,7 @@ by a sched_ext scheduler:
> >      ops.runnable();         /* Task becomes ready to run */
> >
> >      while (task_is_runnable(task)) {
> > -        if (task is not in a DSQ && task->scx.slice == 0) {
> > +        if (task is not in a DSQ || task->scx.slice == 0) {
> >              ops.enqueue();  /* Task can be added to a DSQ */
> >
> >              /* Task property change (i.e., affinity, nice, etc.)? */
> >
> > Because we trigger ops.enqueue() when the task expired its time slice or it
> > becomes runnable and has not been added to a DSQ.
> >
> > This also represents correctly the sched_change() scenario: a task being
> > re-enqueued after sched_change() still has its time slice > 0, but we need to
> > call ops.enqueue() for it.
>
> I agree that the condition should be changed, but I'm not sure that this is
> what it should look like.
>
> Is the "task is not in a DSQ" part of the condition there to handle direct
> dispatch? Apart from direct dispatch from ops.select_cpu(), I wasn't able to
> come up with a situation where we would reach this condition with the task
> present on some DSQ.

The intent is to represent the direct dispatch from ops.select_cpu(), since in
that case ops.enqueue() is skipped.

Honestly, I think that changing the && to || in that condition is enough to
make the pseudocode pretty accurate.
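
To make the difference concrete, here is a minimal sketch in plain userspace C
(not kernel code) that evaluates both forms of the condition for the cases
discussed in this thread. The struct task_model type and both helper functions
are made-up names for illustration only; the slice field merely stands in for
task->scx.slice.

```c
#include <stdbool.h>

/* Hypothetical, simplified view of the state the pseudocode looks at. */
struct task_model {
	bool in_dsq;		/* task already sits on a DSQ */
	unsigned long slice;	/* stands in for task->scx.slice */
};

/* Condition as currently written in the documentation (&&). */
static bool needs_enqueue_and(const struct task_model *t)
{
	return !t->in_dsq && t->slice == 0;
}

/* Condition as proposed above (||). */
static bool needs_enqueue_or(const struct task_model *t)
{
	return !t->in_dsq || t->slice == 0;
}
```

With a task that was directly dispatched from ops.select_cpu() (in a DSQ,
slice > 0), both forms skip ops.enqueue(). With a task re-enqueued after
sched_change() (not in a DSQ, slice > 0), only the || form calls
ops.enqueue(), which is the case the && form misses.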

>
> A more general comment about the pseudocode: I think it can be useful to
> introduce someone new to the general flow of the callbacks in sched_ext,
> but the documentation should be clear that this is a simplified view that
> makes assumptions about the behavior of the BPF scheduler itself (flags like
> SCX_OPS_ENQ_LAST, whether the scheduler uses direct dispatch), as well as
> the overall system (Can sched_ext be preempted by a higher-priority sched
> class? Can scheduling properties of a task be changed while it's running?)
> Without stating these assumptions clearly, we risk leaving the reader falsely
> believing they have a complete understanding.

Of course, this schema is not a complete representation of the entire sched_ext
state machine: if we included everything, it would become too big and complex.
I think we should just cover the most common use cases here. Maybe we can
clarify this in the description before this diagram.

Thanks,
-Andrea