Re: [PATCH] sched_ext: Fix spurious WARN on stale ops_state in ops_dequeue()

From: Andrea Righi

Date: Wed May 13 2026 - 12:49:45 EST


On Wed, May 13, 2026 at 06:41:26PM +0200, Samuele Mariotti wrote:
...
> Thanks for the suggestion. I agree with adding cpu_relax() and the
> retry limit to preserve the original WARN_ON_ONCE() as a safety net
> for real bugs.
>
> While we are touching the retry loop, I would also replace the plain
> read of p->scx.flags with READ_ONCE(). That prevents the compiler from
> caching the value across retries, so each iteration reloads the flags and
> can observe the update made by the concurrent finish_dispatch(). I would
> also lower the retry limit from 128 to 4: the maximum number of retries
> observed empirically is 1, so 4 leaves a reasonable safety margin without
> spinning unnecessarily long.
>
> Something like this:
>
> if (!(READ_ONCE(p->scx.flags) & SCX_TASK_IN_CUSTODY) &&
>     !WARN_ON_ONCE(retries++ >= 4)) {
>         cpu_relax();
>         goto retry;
> }
>
> Let me know if this looks good to you.

Yeah, that sounds reasonable to me.
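
For anyone following along, the bounded-retry behaviour of the snippet
above can be modelled in plain userspace C. This is only a rough sketch,
not the kernel code: struct task_sketch, observe_flags(),
wait_for_custody() and the writer_delay field are all made up for
illustration, the volatile cast merely stands in for READ_ONCE(), and the
concurrent finish_dispatch() is simulated by a counter instead of a real
second CPU.

```c
#include <stdio.h>

#define IN_CUSTODY_SKETCH   0x1u  /* hypothetical stand-in for SCX_TASK_IN_CUSTODY */
#define MAX_RETRIES_SKETCH  4     /* retry limit proposed in this thread */

/* Hypothetical model of the task state; not the kernel's struct. */
struct task_sketch {
	unsigned int flags;
	int writer_delay;  /* reads that still see the flag clear */
};

/*
 * Simulate one observation of the flags. Once writer_delay reaches zero
 * the "concurrent writer" sets the flag. The volatile cast forces a fresh
 * load on every call, which is the role READ_ONCE() plays in the kernel.
 */
static unsigned int observe_flags(struct task_sketch *t)
{
	if (t->writer_delay == 0)
		t->flags |= IN_CUSTODY_SKETCH;
	else
		t->writer_delay--;

	return *(const volatile unsigned int *)&t->flags;
}

/*
 * Return the number of retries needed to see the flag, or -1 when the
 * limit is exceeded (the point where the kernel code would WARN).
 */
static int wait_for_custody(struct task_sketch *t)
{
	int retries = 0;

	for (;;) {
		if (observe_flags(t) & IN_CUSTODY_SKETCH)
			return retries;
		if (retries++ >= MAX_RETRIES_SKETCH)
			return -1;
		/* cpu_relax() would go here in the kernel version */
	}
}
```

In this model, writer_delay = 1 matches the case reported above (a single
retry), writer_delay = 4 still succeeds on the last allowed retry, and
anything larger returns -1, which is where the WARN_ON_ONCE() would fire.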

Thanks,
-Andrea