Re: [PATCH 1/6] sched/proxy: Remove superfluous clear_task_blocked_in()
From: John Stultz
Date: Thu May 28 2026 - 19:25:58 EST
On Tue, May 26, 2026 at 4:39 PM John Stultz <jstultz@xxxxxxxxxx> wrote:
> On Tue, May 26, 2026 at 4:16 AM Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> >
> > Per the discussion here:
> >
> > https://lore.kernel.org/all/20260403112810.GG3738786@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/
> >
> > The reason for this condition is that the signal condition in
> > try_to_block_task() would set_task_blocked_in_waking(). However, it no longer
> > does that, in fact, that path does clear_task_blocked_on(), rendering the
> > clause under discussion moot.
> >
> > Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
> > ---
> > kernel/sched/core.c | 3 ---
> > 1 file changed, 3 deletions(-)
> >
> > --- a/kernel/sched/core.c
> > +++ b/kernel/sched/core.c
> > @@ -7132,9 +7132,6 @@ static void __sched notrace __schedule(i
> > if (sched_proxy_exec()) {
> > struct task_struct *prev_donor = rq->donor;
> >
> > - if (!prev_state && prev->blocked_on)
> > - clear_task_blocked_on(prev, NULL);
> > -
> > rq_set_donor(rq, next);
> > next->blocked_donor = NULL;
> > if (unlikely(next->is_blocked && next->blocked_on)) {
>
> Oh good! I had a note to try to re-confirm if that chunk was really
> needed, as it did feel a bit like it was patching up a problem after
> the fact.
>
> That said, running this on top of your sched/proxy branch tripped over
> warnings with the ww_mutex selftest, so it looks like there's
> something else missing before this can land.
>
> Digging in, it looks like we still need the fix I had here:
> https://lore.kernel.org/lkml/20260430215103.2978955-3-jstultz@xxxxxxxxxx/
>
> Since without that, we can get into a situation where we have
> blocked_on set when a task __state is TASK_RUNNING. The segment
> you're dropping would catch and clear that out, but really we should
> avoid getting into that situation in the mutex lock code.
Hey Peter,
So I've tested your full sched/proxy tree, and with the entire set
applied it looks ok.
However, even with the fix I pointed out, I've unfortunately hit races
with the ww_mutex selftest at the point of this patch in the series.
Basically, the races show up between commit
1b89b7b21bf5 ("sched/proxy: Remove superfluous clear_task_blocked_in()")
and
a8be1edac5a1 ("sched/proxy: Remove PROXY_WAKING")
I'm currently tracking down exactly why the race is cropping up, but I
believe the chunk removed by this patch is what avoids cases where we
end up with PROXY_WAKING set on a TASK_RUNNING task.
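To make the suspected invariant concrete, here is a minimal userspace
sketch. It is not kernel code: every model_* name is a hypothetical
stand-in, and the interleaving in model_race_demo() is only an assumed
example, since the real race has not been pinned down yet. It just
shows how a check shaped like the removed hunk would paper over a
runnable task that still has blocked_on set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these are the kernel's definitions. */
enum model_state { MODEL_RUNNING = 0, MODEL_BLOCKED = 1 };

struct model_mutex { int unused; };

struct model_task {
	enum model_state state;          /* stands in for task->__state */
	struct model_mutex *blocked_on;  /* stands in for task->blocked_on */
};

/* Invariant: a runnable task must not still claim to be blocked on a lock. */
static bool model_invariant_holds(const struct model_task *t)
{
	return !(t->state == MODEL_RUNNING && t->blocked_on != NULL);
}

/*
 * Shaped like the removed hunk:
 *	if (!prev_state && prev->blocked_on)
 *		clear_task_blocked_on(prev, NULL);
 */
static void model_schedule_fixup(struct model_task *prev)
{
	if (prev->state == MODEL_RUNNING && prev->blocked_on)
		prev->blocked_on = NULL;
	assert(model_invariant_holds(prev));
}

/*
 * Assumed interleaving: the task records what it is blocked on, then is
 * made runnable again before anything clears blocked_on. Returns whether
 * the invariant holds once the modelled schedule path has run.
 */
static bool model_race_demo(bool with_fixup)
{
	struct model_mutex lock = { 0 };
	struct model_task prev = { MODEL_BLOCKED, &lock };

	prev.state = MODEL_RUNNING;	/* made runnable, blocked_on left stale */

	if (with_fixup)
		model_schedule_fixup(&prev);

	return model_invariant_holds(&prev);
}
```

In this model, model_race_demo(true) leaves the task consistent while
model_race_demo(false) leaves a stale blocked_on behind. That is the
kind of state the removed chunk would have been hiding.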
I'll get back to you when I get my head around it properly, but wanted
to raise the issue in case you or K Prateek can see right through it.
Again, once the PROXY_WAKING code is dropped the race seems to go
away, but I'd like to understand it better so we don't leave a
broken window in the middle of the patch series.
thanks
-john