Re: [PATCH 4/6] sched/proxy: Switch proxy to use p->is_blocked
From: John Stultz
Date: Tue May 26 2026 - 15:49:13 EST
On Tue, May 26, 2026 at 7:57 AM Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>
> On Tue, May 26, 2026 at 01:16:13PM +0200, Peter Zijlstra wrote:
> > Rather than gate the proxy paths with p->blocked_on, use p->is_blocked.
> >
> > This opens up the state: '->is_blocked && !->blocked_on' for future use.
> >
> > Notably, only proxy and delayed tasks can be ->on_rq && ->is_blocked, and it is
> > guaranteed that sched_class::pick_task() will never return a delayed task.
> > Therefore any task returned from pick_next_task() that has ->is_blocked set,
> > must be a proxy task.
> >
> > XXX: ttwu_runnable(): AFAICT this results in all delayed tasks getting blocked
> > and send down the long wakeup-path -- and while there were some plans there
> > [*], that was especially careful to not take all those locks.
> >
> > [*] https://lore.kernel.org/r/20250702114924.091581796@xxxxxxxxxxxxx
> >
> > Suggested-by: K Prateek Nayak <kprateek.nayak@xxxxxxx>
> > Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
> > ---
> > kernel/sched/core.c | 12 ++++++------
> > 1 file changed, 6 insertions(+), 6 deletions(-)
> >
> > --- a/kernel/sched/core.c
> > +++ b/kernel/sched/core.c
> > @@ -3764,7 +3764,7 @@ static inline void proxy_reset_donor(str
> > */
> > static inline bool proxy_needs_return(struct rq *rq, struct task_struct *p)
> > {
> > - if (!task_is_blocked(p))
> > + if (!p->is_blocked)
> > return false;
>
> Oh, I think we can solve things if we have a cpus_allowed check here. If
> the task is on an allowed CPU, it don't need migration and we can carry
> on without eating the overhead.
I need to look more closely at your patches here, but I had a similar
shortcut a while back in early versions of the series. I dropped it because
it was pointed out that on big-little systems, you might have an
important task on a big CPU that proxy-migrates to a little CPU to get a
lock owned by a background task released quickly. But when the owner
wakes up the donor, if it wakes it on the little CPU's rq, it may be
a while before the important task gets re-balanced to a big CPU, hurting
performance. Instead, it seemed better to match the non-proxy behavior:
a proxy-migrated task is treated the same as if it were off the
rq and blocked, so on wakeup we'd want to do the full placement
(hopefully back to a big CPU, but at least wherever select_task_rq()
chooses).
That way proxy-migrations won't disrupt scheduling behavior very much.
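
To make the concern concrete, here is a toy userspace sketch of the two wakeup policies. It is not kernel code: every name in it (toy_task, toy_select_cpu(), toy_wake_shortcut(), toy_wake_full()) is hypothetical, and toy_select_cpu() is only a crude stand-in for whatever select_task_rq() would really decide. The assumption is a task whose affinity allows every CPU, which is exactly the case where a cpus_allowed check would skip the re-placement.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy model; none of these names are kernel APIs. */
enum cpu_kind { CPU_LITTLE, CPU_BIG };

struct toy_task {
	unsigned long cpus_allowed;	/* affinity bitmask, bit N == CPU N */
	int cur_cpu;			/* CPU the task was proxy-migrated to */
	bool important;			/* wants a big CPU if one is allowed */
};

static bool toy_cpu_allowed(const struct toy_task *p, int cpu)
{
	return (p->cpus_allowed >> cpu) & 1UL;
}

/*
 * Crude stand-in for full wakeup placement: important tasks prefer the
 * first allowed big CPU, others the first allowed little CPU, falling
 * back to the first allowed CPU of any kind (-1 if none is allowed).
 */
static int toy_select_cpu(const struct toy_task *p,
			  const enum cpu_kind *kinds, int nr_cpus)
{
	int fallback = -1;

	assert(nr_cpus > 0);
	for (int cpu = 0; cpu < nr_cpus; cpu++) {
		if (!toy_cpu_allowed(p, cpu))
			continue;
		if (fallback < 0)
			fallback = cpu;
		if (p->important == (kinds[cpu] == CPU_BIG))
			return cpu;
	}
	return fallback;
}

/* Shortcut policy: stay put whenever the current CPU is allowed. */
static int toy_wake_shortcut(const struct toy_task *p,
			     const enum cpu_kind *kinds, int nr_cpus)
{
	if (toy_cpu_allowed(p, p->cur_cpu))
		return p->cur_cpu;
	return toy_select_cpu(p, kinds, nr_cpus);
}

/* Full policy: always redo placement, as if the task had been off the rq. */
static int toy_wake_full(const struct toy_task *p,
			 const enum cpu_kind *kinds, int nr_cpus)
{
	return toy_select_cpu(p, kinds, nr_cpus);
}
```

With CPUs 0-1 little and 2-3 big, an important task with affinity 0xf that was proxy-migrated to CPU 0 stays on CPU 0 under the shortcut, while the full path puts it back on CPU 2. The shortcut only falls through to placement when the affinity check fails, so it can't help in the case above.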
thanks
-john