Re: [PATCH 1/2] sched_ext: Fix ops.dequeue() semantics


From: Andrea Righi

Date: Mon Feb 02 2026 - 04:29:23 EST

On Mon, Feb 02, 2026 at 08:45:18AM +0100, Andrea Righi wrote:
...
> > So I have finally gotten around updating scx_storm to the new semantics,
> > see:
> > https://github.com/cloehle/scx/tree/cloehle/scx-storm-qmap-insert-local-dequeue-semantics
> >
> > I don't think the new ops.dequeue() are enough to make inserts to local-on
> > from anywhere safe, because it's still racing with dequeue from another CPU?
>
> Yeah, with this patch set BPF schedulers get proper ops.dequeue()
> callbacks, but we're not fixing the usage of SCX_DSQ_LOCAL_ON from
> ops.dispatch().
>
> When task properties change between scx_bpf_dsq_insert() and the actual
> dispatch, task_can_run_on_remote_rq() can still trigger a fatal
> scx_error().
>
> The ops.dequeue(SCX_DEQ_SCHED_CHANGE) notification happens after the
> property change, so it can't prevent already-queued dispatches from
> failing. The race window is between ops.dispatch() returning and
> dispatch_to_local_dsq() executing.
>
> We can address this in a separate patch set. One thing at a time. :)

Thinking more about this, the problem is that we're passing enforce=true to
task_can_run_on_remote_rq(), which triggers a critical failure via
scx_error(). There is logic in task_can_run_on_remote_rq() to fall back to
the global DSQ, but that fallback never happens when we pass enforce=true,
because scx_error() fires first.

However, instead of the global DSQ fallback, I was wondering if it'd be
better to simply re-enqueue the task - setting SCX_ENQ_REENQ - if the
target local DSQ isn't valid anymore when the dispatch is finalized.

This way, using SCX_DSQ_LOCAL_ON | cpu from ops.dispatch() would simply
trigger a re-enqueue when "cpu" isn't valid anymore (due to concurrent
affinity or migration-disabled changes), and the BPF scheduler can handle
that in another ops.enqueue().
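
To make the idea a bit more concrete, here is a rough userspace toy model of
the flow I have in mind. This is not kernel code and not a proposed patch:
every toy_* name is made up for illustration, TOY_ENQ_REENQ is only a
stand-in for SCX_ENQ_REENQ (its value is arbitrary), and the validity check
is a simplified assumption of what gets verified when the dispatch is
finalized.

```c
/* Userspace toy model only: none of these names exist in the kernel. */
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_NR_CPUS   8
#define TOY_ENQ_REENQ (1ULL << 0) /* stand-in for SCX_ENQ_REENQ, value made up */

enum toy_result { TOY_DISPATCHED, TOY_REENQUEUED };

struct toy_task {
	uint32_t allowed_cpus;       /* bit N set: task may run on CPU N */
	bool     migration_disabled; /* task is pinned to cur_cpu for now */
	int      cur_cpu;            /* CPU the task is currently tied to */
	uint64_t last_enq_flags;     /* flags seen by the last toy_enqueue() */
	int      nr_enqueues;        /* how many times toy_enqueue() ran */
};

/* Simplified model of the check done when the dispatch is finalized. */
static bool toy_can_run_on(const struct toy_task *p, int cpu)
{
	assert(cpu >= 0 && cpu < TOY_NR_CPUS);
	if (p->migration_disabled && cpu != p->cur_cpu)
		return false;
	return (p->allowed_cpus & (1u << cpu)) != 0;
}

/* Models ops.enqueue(): the scheduler gets another chance to pick a DSQ. */
static void toy_enqueue(struct toy_task *p, uint64_t enq_flags)
{
	p->last_enq_flags = enq_flags;
	p->nr_enqueues++;
}

/*
 * Finalize a "LOCAL_ON | cpu" dispatch. If the target CPU is no longer
 * valid, hand the task back to the scheduler with the re-enqueue flag
 * instead of raising a fatal error.
 */
static enum toy_result toy_finish_dispatch(struct toy_task *p, int cpu)
{
	if (!toy_can_run_on(p, cpu)) {
		toy_enqueue(p, TOY_ENQ_REENQ);
		return TOY_REENQUEUED;
	}
	return TOY_DISPATCHED;
}
```

In this model, an affinity change that lands between the insert and the
finalization only costs one extra enqueue, and the scheduler can tell from
the flag that the task came back rather than being freshly woken.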

What do you think?

Thanks,
-Andrea