Re: [PATCH v26 00/10] Simple Donor Migration for Proxy Execution

From: John Stultz

Date: Fri Apr 03 2026 - 20:26:37 EST


On Fri, Apr 3, 2026 at 8:39 AM K Prateek Nayak <kprateek.nayak@xxxxxxx> wrote:
> On 4/3/2026 8:08 PM, Peter Zijlstra wrote:
> > On Fri, Apr 03, 2026 at 07:13:29PM +0530, K Prateek Nayak wrote:
> >> Hello Peter,
> >>
> >> On 4/3/2026 4:58 PM, Peter Zijlstra wrote:
> >>> On Fri, Apr 03, 2026 at 03:55:22PM +0530, K Prateek Nayak wrote:
> >>>>>> @@ -4256,6 +4277,15 @@ int try_to_wake_up(struct task_struct *p, unsigned int state, int wake_flags)
> >>>>>>  	 */
> >>>>>>  	smp_cond_load_acquire(&p->on_cpu, !VAL);
> >>>>>>
> >>>>>> +	/*
> >>>>>> +	 * We never clear the blocked_on relation on proxy_deactivate.
> >>>>>> +	 * If we don't clear it here, we have TASK_RUNNING + p->blocked_on
> >>>>>> +	 * when waking up. Since this is a fully blocked, off CPU task
> >>>>>> +	 * waking up, it should be safe to clear the blocked_on relation.
> >>>>>> +	 */
> >>>>>> +	if (task_is_blocked(p))
> >>>>>> +		clear_task_blocked_on(p, NULL);
> >>>>>> +
> >>>>>
> >>>>> Aah, yes! This is when find_proxy_task() hits deactivate() for us and we
> >>>>> skip ttwu_runnable(). We still need to clear ->blocked_on.
> >>>
> >>> I wonder, should we have proxy_deactivate() do this instead?
> >>
> >> That is one way to tackle that, yes!
> >
> > OK, let's put it there. At that site we already know task_is_blocked()
> > and we get less noise in the wakeup path.
> >
> > Or should we perhaps put it in block_task() itself? The moment you're
> > off the runqueue, ->blocked_on becomes meaningless.
>
> Ack but I'll have to point you to these next bits in John's tree that
> handle the sleeping owner case:
> https://github.com/johnstultz-work/linux-dev/commit/255c9e933edf5b86e29f9fbde67738fc5041a862
>
> Essentially, going further, when the blocked_on chain encounters a
> blocked owner, they'll block themselves and attach onto the sleeping
> owner - when the owner wakes up, the whole chain is activated in one go
> restoring proxy.
>
> This is why John has suggested that block_task() is probably not the
> right place to clear it since, for the sleeping owner bits, we need to
> preserve the blocked_on relation until ttwu().
>
> I have some ideas but let me first see if I can stop them from
> exploding on my system :-)

Phew, you two are hard to keep up with. :) I really wanted to get my
v27 set out last night, but then got derailed by the (not
proxy-related) dl_server issue I was seeing in testing.

Anyway, I'd still like to get it out soon, but now I'd really like to
have the approach here included, so...

I'm currently testing with my best guess of the combined suggestions
you've both tossed into this thread. Unfortunately I still trip over
state == TASK_RUNNING in proxy_deactivate(), so I'm trying to debug
that now.
(I think the issue is that we hit the ttwu_queue_wakelist() case
without clearing PROXY_WAKING, so we need to call
clear_task_blocked_on() earlier in ttwu, likely right after setting
TASK_WAKING. That's looking ok so far.)
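
To make the ordering concern concrete, here is a toy user-space model,
not kernel code. Every toy_* name is hypothetical, and the model
collapses the PROXY_WAKING/blocked_on bookkeeping into a single pointer,
which is a simplifying assumption. It only illustrates why clearing the
relation before handing the wakeup to a remote-queueing path avoids
ending up with a running task that still looks blocked:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for task states; these are NOT the kernel's values. */
enum toy_state { TOY_BLOCKED, TOY_WAKING, TOY_RUNNING };

struct toy_task {
	enum toy_state state;
	const void *blocked_on;	/* stands in for p->blocked_on */
};

/* Hypothetical analogue of clear_task_blocked_on(p, NULL). */
static void toy_clear_blocked_on(struct toy_task *p)
{
	p->blocked_on = NULL;
}

/*
 * Models a wakeup path that hands the task off and eventually marks
 * it running, without ever touching blocked_on (assumption: this is
 * roughly what the wakelist case looks like from the waker's side).
 */
static void toy_queue_wakelist(struct toy_task *p)
{
	p->state = TOY_RUNNING;
}

/* clear_early selects whether the relation is dropped right after WAKING. */
static void toy_wake(struct toy_task *p, bool clear_early)
{
	p->state = TOY_WAKING;
	if (clear_early)
		toy_clear_blocked_on(p);
	toy_queue_wakelist(p);
}

/* The combination a deactivate path would trip over: running + blocked_on. */
static bool toy_inconsistent(const struct toy_task *p)
{
	return p->state == TOY_RUNNING && p->blocked_on != NULL;
}
```

With clear_early set to false the task ends up running with blocked_on
still set; with it set to true the relation is gone before the handoff,
so the inconsistent combination never becomes visible.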

After I get this into a stable state, I'll try to polish it up a bit
and then re-layer the rest of the proxy-exec series on top (I do fret
the sleeping owner enqueueing will be more complicated).
-john