Re: [PATCH 1/2] sched_ext: Fix ops.dequeue() semantics

From: Tejun Heo
Date: Sun Dec 28 2025 - 19:06:28 EST


Sorry about the million replies. Pretty squirrel brained right now.

On Fri, Dec 19, 2025 at 11:43:14PM +0100, Andrea Righi wrote:
> @@ -1390,6 +1390,9 @@ static void do_enqueue_task(struct rq *rq, struct task_struct *p, u64 enq_flags,
> WARN_ON_ONCE(atomic_long_read(&p->scx.ops_state) != SCX_OPSS_NONE);
> atomic_long_set(&p->scx.ops_state, SCX_OPSS_QUEUEING | qseq);
>
> + /* Mark that ops.enqueue() is being called for this task */
> + p->scx.flags |= SCX_TASK_OPS_ENQUEUED;

Is this guaranteed to be cleared after dispatch? ops_dequeue() is called
from dequeue_task_scx() and set_next_task_scx(). It looks like the call from
set_next_task_scx() may end up calling ops.dequeue() when the task starts
running, but that seems mostly accidental.

- The BPF sched probably expects the ops.dequeue() call immediately after
dispatch rather than on the running transition. For example, imagine a
scenario where a BPF sched dispatches multiple tasks to a local DSQ.
Wouldn't the expectation be that ops.dequeue() is called as soon as a task
is dispatched into a local DSQ?

- If this depends on the ops_dequeue() call from set_next_task_scx(), it'd
also be using the wrong DEQ flag - SCX_DEQ_CORE_SCHED_EXEC - for a regular
ops.dequeue() following a dispatch. That call is the way it is only
because ops_dequeue() didn't do anything when the state was OPSS_NONE.
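
To make the ordering concern in the two points above concrete, here's a
rough user-space sketch. It is not kernel code: every model_* name and the
clear_point enum are made up for illustration, and the only thing it
models is when a flag standing in for SCX_TASK_OPS_ENQUEUED gets cleared.
It assumes the flag is cleared exactly when the modeled ops.dequeue() runs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model; none of these names are kernel API. */
enum clear_point {
	CLEAR_ON_DISPATCH,	/* modeled ops.dequeue() runs at dispatch time */
	CLEAR_ON_RUN,		/* it runs only when the task starts running */
};

struct model_task {
	bool ops_enqueued;	/* stands in for SCX_TASK_OPS_ENQUEUED */
	int dequeue_calls;	/* times the modeled ops.dequeue() ran */
};

/* Modeled ops.dequeue(): a no-op unless the task is flagged as enqueued. */
static void model_ops_dequeue(struct model_task *t)
{
	if (!t->ops_enqueued)
		return;
	t->ops_enqueued = false;
	t->dequeue_calls++;
}

/* Modeled ops.enqueue() path: mark the task as handed to the BPF sched. */
static void model_enqueue(struct model_task *t)
{
	assert(!t->ops_enqueued);
	t->ops_enqueued = true;
}

/* Modeled dispatch into a local DSQ. */
static void model_dispatch_local(struct model_task *t, enum clear_point when)
{
	if (when == CLEAR_ON_DISPATCH)
		model_ops_dequeue(t);
}

/* Modeled running transition. */
static void model_set_next_task(struct model_task *t, enum clear_point when)
{
	if (when == CLEAR_ON_RUN)
		model_ops_dequeue(t);
}

/*
 * Enqueue and dispatch n tasks to a modeled local DSQ, run only the first
 * one, and return how many tasks are still flagged as enqueued.
 */
static size_t model_pending_after_first_run(size_t n, enum clear_point when)
{
	struct model_task tasks[16] = { 0 };
	size_t i, pending = 0;

	assert(n >= 1 && n <= 16);

	for (i = 0; i < n; i++) {
		model_enqueue(&tasks[i]);
		model_dispatch_local(&tasks[i], when);
	}
	model_set_next_task(&tasks[0], when);

	for (i = 0; i < n; i++)
		pending += tasks[i].ops_enqueued;
	return pending;
}
```

In this model, with three tasks dispatched and only the first one running,
CLEAR_ON_DISPATCH leaves no task flagged, while CLEAR_ON_RUN leaves the two
tasks still sitting in the local DSQ flagged until each of them gets to run.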

Thanks.

--
tejun