Re: [PATCH 4/6] sched/proxy: Switch proxy to use p->is_blocked

From: Peter Zijlstra

Date: Wed May 27 2026 - 04:32:07 EST


On Tue, May 26, 2026 at 07:25:13PM -0700, John Stultz wrote:
> On Tue, May 26, 2026 at 4:16 AM Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> >
> > Rather than gate the proxy paths with p->blocked_on, use p->is_blocked.
> >
> > This opens up the state: '->is_blocked && !->blocked_on' for future use.
> >
> > Notably, only proxy and delayed tasks can be ->on_rq && ->is_blocked, and it is
> > guaranteed that sched_class::pick_task() will never return a delayed task.
> > Therefore any task returned from pick_next_task() that has ->is_blocked set,
> > must be a proxy task.
>
> While this seems true, it also feels very subtle.
>
> Just taking a step back, while it might be possible, I'm not sure I'm
> totally seeing the benefit of doing this.
>
> When we were playing around with the idea of keeping ptr+latch-bit in
> the blocked_on field, using NULL+latch to replace PROXY_WAKING made
> sense, but with is_blocked being used for more than just proxy logic,
> I'm not sure encoding meaning across the two fields is particularly
> intuitive (and def seems more error prone). Is the special
> PROXY_WAKING value really so awful? Or maybe does it make sense to
> have different values for is_blocked (DELAYED, PROXY) so we can better
> separate the variants when combining with blocked_on?
>
> It is a little funny to see how close this is getting to the separate
> blocked_on_state + blocked_on management I had way back when before we
> compressed that down with PROXY_WAKING. :)

Yeah, I was thinking the same. But then last night, after it cooled down
a bit, my brain started working again and I realized that there is a
simple test that should work.

Basically, *IF* we are proxy migrated -- and thus need a return
migration -- then task_cpu(p) != p->wake_cpu, per proxy_set_task_cpu().

This doesn't suffer the random migration issues you get from purely
checking against p->cpus_ptr, and it is more specific than PROXY_WAKING,
in that it will really only do the long path / migration if we do in
fact need return migration. If we stayed on the right CPU, we simply
stay there.
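
For illustration only, here is a userspace toy model of that test. It is a
sketch, not kernel code: struct toy_task and all toy_* helpers are made-up
names that stand in for task_cpu(p), p->wake_cpu, task_is_blocked(p), the
regular set-cpu path and proxy_set_task_cpu(). The assumption taken from
above is that a regular migration keeps both fields in sync, while a proxy
migration leaves wake_cpu pointing at the original CPU.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the few task_struct fields involved. */
struct toy_task {
	int cpu;	/* models task_cpu(p) */
	int wake_cpu;	/* models p->wake_cpu */
	bool blocked;	/* models task_is_blocked(p) */
};

/* Models a regular migration: cpu and wake_cpu move together. */
static void toy_set_task_cpu(struct toy_task *p, int cpu)
{
	p->cpu = cpu;
	p->wake_cpu = cpu;
}

/* Models a proxy migration: wake_cpu keeps the original CPU. */
static void toy_proxy_set_task_cpu(struct toy_task *p, int cpu)
{
	p->cpu = cpu;
}

/*
 * Models only the early-exit checks: returns true where the real
 * function would continue into the long path / return migration.
 */
static bool toy_needs_return(const struct toy_task *p)
{
	if (!p->blocked)
		return false;

	/* Never proxy migrated, or already back on the original CPU. */
	if (p->cpu == p->wake_cpu)
		return false;

	return true;
}
```

In this model a blocked task that was proxy migrated from CPU 0 to CPU 2
reports true, and one that was proxy migrated back to CPU 0 (or never left)
reports false, which is the "if we stayed on the right CPU, we simply stay
there" case.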

So I've stuck the below into the series between 3 and 4. This seems to
survive boot with ww_mutex selftest and hackbench.


---
kernel/sched/core.c | 15 +++++++++++++++
1 file changed, 15 insertions(+)

--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -3767,6 +3767,21 @@ static inline bool proxy_needs_return(st
 	if (!task_is_blocked(p))
 		return false;

+	/*
+	 * Typically per __set_task_cpu(), task_cpu(p) == p->wake_cpu.
+	 *
+	 * However, proxy_set_task_cpu() is such that it preserves the
+	 * original cpu in p->wake_cpu while migrating p for proxy reasons
+	 * (possibly outside of the allowed p->cpus_ptr).
+	 *
+	 * Furthermore, migration_cpu_stop() / __migrate_swap_task(), will
+	 * only set p->wake_cpu when !p->on_rq, and since here p->on_rq, this
+	 * will not apply. But if it did, this check is the safe way around
+	 * and would migrate.
+	 */
+	if (task_cpu(p) == p->wake_cpu)
+		return false;
+
 	scoped_guard(raw_spinlock, &p->blocked_lock) {
 		/* Task is waking up; clear any blocked_on relationship */
 		__clear_task_blocked_on(p, NULL);