Re: [PATCH v29 1/9] sched: Rework pick_next_task() and prev_balance() to avoid stale prev references

From: John Stultz

Date: Tue May 19 2026 - 22:45:41 EST


On Tue, May 19, 2026 at 3:32 AM Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> On Tue, May 12, 2026 at 02:56:11AM +0000, John Stultz wrote:
> > Historically, the prev value from __schedule() was the rq->curr.
> > This prev value is passed down through numerous functions, and
> > used in the class scheduler implementations. The fact that
> > prev was on_cpu until the end of __schedule(), meant it was
> > stable across the rq lock drops that the class->pick_next_task()
> > and ->balance() implementations often do.
> >
> > However, with proxy-exec, the prev passed to functions called
> > by __schedule() is rq->donor, which may not be the same as
> > rq->curr and may not be on_cpu, this makes the prev value
> > potentially unstable across rq lock drops.
> >
> > A recently found issue with proxy-exec, is when we begin doing
> > return migration from try_to_wake_up(), its possible we may be
> > waking up the rq->donor. When we do this, we proxy_resched_idle()
> > to put_prev_set_next() setting the rq->donor to rq->idle, allowing
> > the rq->donor to be return migrated and allowed to run.
> >
> > This however runs into trouble, as on another cpu we might be in
> > the middle of calling __schedule(). Conceptually the rq lock is
> > held for the majority of the time, but in calling pick_next_task()
> > its possible the class->pick_next_task() handler or the
> > ->balance() call may briefly drop the rq lock. This opens a
> > window for try_to_wake_up() to wake and return migrate the
> > rq->donor before the class logic reacquires the rq lock.
> >
> > Unfortunately pick_next_task() and prev_balance() pass in a prev
> > argument, to which we pass rq->donor. However this prev value can
> > now become stale and incorrect across a rq lock drop.
> >
> > So, to correct this, rework the pick_next_task() and
> > prev_balance() calls so that they do not take a "prev" argument.
> >
> > Also rework the class ->pick_next_task() and ->balance()
> > implementations to drop the prev argument, and in the cases
> > where it was used, and have the class functions reference
> > rq->donor directly, and not save the value across rq lock drops
> > so that we don't end up with a stale references.
> >
> > Signed-off-by: John Stultz <jstultz@xxxxxxxxxx>
>
> Right, so I think I'll take patches 1,2,3,8,9 from the flat series and
> munge this on top. That gets rid of all that pick_next_task() nonsense
> and simplifies things a little.

Sounds good to me! Let me know when you have a branch posted that I
should use as a base for any further iterations on the rest of the
proxy chagnes I submitted.

thanks
-john