Re: [PATCH 1/2] sched_ext: Fix ops.dequeue() semantics
From: Tejun Heo
Date: Sat Feb 14 2026 - 12:56:21 EST
Hello, Andrea.
On Sat, Feb 14, 2026 at 11:16:34AM +0100, Andrea Righi wrote:
> I ran more tests and I don't think we can simply rely on p->scx.sticky_cpu.
>
> In particular, I don't see how to handle this scenario using only
> p->scx.sticky_cpu: a task starts an internal migration, a sched_change
> occurs, and ops.dequeue() gets skipped because p->scx.sticky_cpu >= 0.
Oh, that shouldn't happen, as move_remote_task_to_local_dsq() does the
following:
	deactivate_task(src_rq, p, 0);
	set_task_cpu(p, cpu_of(dst_rq));
	p->scx.sticky_cpu = cpu_of(dst_rq);
	raw_spin_rq_unlock(src_rq);
	raw_spin_rq_lock(dst_rq);
	...
	activate_task(dst_rq, p, 0);
It *looks* like something can get in while the locks are switched; however,
the above deactivate_task() does WRITE_ONCE(p->on_rq, TASK_ON_RQ_MIGRATING)
and task_rq_lock() does the following:
	for (;;) {
		raw_spin_lock_irqsave(&p->pi_lock, rf->flags);
		rq = task_rq(p);
		raw_spin_rq_lock(rq);
		/*
		 *	move_queued_task()		task_rq_lock()
		 *
		 *	ACQUIRE (rq->lock)
		 *	[S] ->on_rq = MIGRATING		[L] rq = task_rq()
		 *	WMB (__set_task_cpu())		ACQUIRE (rq->lock);
		 *	[S] ->cpu = new_cpu		[L] task_rq()
		 *					[L] ->on_rq
		 *	RELEASE (rq->lock)
		 *
		 *	If we observe the old CPU in task_rq_lock(), the acquire of
		 *	the old rq->lock will fully serialize against the stores.
		 *
		 *	If we observe the new CPU in task_rq_lock(), the address
		 *	dependency headed by '[L] rq = task_rq()' and the acquire
		 *	will pair with the WMB to ensure we then also see migrating.
		 */
		if (likely(rq == task_rq(p) && !task_on_rq_migrating(p))) {
			rq_pin_lock(rq, rf);
			return rq;
		}
		raw_spin_rq_unlock(rq);
		raw_spin_unlock_irqrestore(&p->pi_lock, rf->flags);

		while (unlikely(task_on_rq_migrating(p)))
			cpu_relax();
	}
IOW, TASK_ON_RQ_MIGRATING works like a separate lock which protects the task
while it's switching RQs, so no operation that goes through task_rq_lock(),
which includes any property change, can get in between.
Thanks.
--
tejun