Re: [PATCH v2 1/2] sched: proxy-exec: Close race causing workqueue work being delayed

From: John Stultz

Date: Fri May 01 2026 - 03:11:38 EST


On Thu, Apr 30, 2026 at 11:39 PM K Prateek Nayak <kprateek.nayak@xxxxxxx> wrote:
>
> Mostly cosmetic nitpicks. The overall idea looks good.
>
> On 5/1/2026 3:20 AM, John Stultz wrote:
> > @@ -2183,18 +2183,56 @@ extern int __cond_resched_rwlock_write(rwlock_t *lock) __must_hold(lock);
> > #ifndef CONFIG_PREEMPT_RT
> >
> > /*
> > - * With proxy exec, if a task has been proxy-migrated, it may be a donor
> > - * on a cpu that it can't actually run on. Thus we need a special state
> > - * to denote that the task is being woken, but that it needs to be
> > - * evaluated for return-migration before it is run. So if the task is
> > - * blocked_on PROXY_WAKING, return migrate it before running it.
> > + * The proxy exec blocked_on pointer value uses the low bit as a latch
> > + * value which clarifies if the blocked_on value is used for proxying or
> > + * not.
> > + *
> > + * The state machine looks something like
> > + * NULL -> ptr:unlatched -> ptr:latched -> PROXY_WAKING -> NULL
> > + *
> > + * With some additional transitions:
> > + * ptr:unlatched -> NULL (done on current, or via set_task_blocked_on_waking())
> > + * ptr:latched -> NULL (done only on current)
> > + *
> > + * 1) NULL and ptr:unlatched are effectively equivalent, no proxying will occur
> > + * 2) ptr:latched is the state when proxying will occur
> > + * 3) PROXY_WAKING is used when the task is being woken to ensure we
> > + * return-migrate proxy-migrated tasks before running them (note it has
> > + * the latch bit set).
> > */
> > -#define PROXY_WAKING ((struct mutex *)(-1L))
> > +#define PROXY_BLOCKED_LATCH (1UL)
> > +#define PROXY_BLOCKED_ON_MASK(x) ((struct mutex *)((unsigned long)(x) & ~PROXY_BLOCKED_LATCH))
>
> nit. I think PROXY_BLOCKED_ON_MUTEX() would be a better name since this
> is returning the true mutex pointer back. No strong feelings, I'll defer
> to others for more comments.

So, in the full series we support other lock types as well (rwsems!).
So I'm being a little generic here so it will be usable without major
renaming later.

> > +#define PROXY_WAKING ((struct mutex *)(-1L)) /* PROXY_WAKING has LATCH bit set */
> > +
> > +static inline struct mutex *task_is_blocked_on(struct task_struct *p)
>
> I think this can take the role of task_is_blocked() no?
>

Yes. I have a follow-on patch already queued that switches
task_is_blocked() for task_is_blocked_on().
But I didn't include it here because it is more then a bug fix.

Personally, I've been starting to bristle with task_is_blocked() name
as I think it gets confusing with block_task() and friends, so
something more explicit seems better (though task_is_blocked_on is
only a smidge better).

> Only one caller for try_to_block_task() will require looking at the
> raw blocked_on state but other than that, it is safe for the scheduler
> to move around the preempted task until it has grabbed the BO latch.

Right. In my cleanup I provide a local proxy_should_block() helper for
that usage. I don't directly used blocked_on because with
!CONFIG_SCHED_PROXY_EXEC it would be good to optimize out conditionals
with a constant.

> > +{
> > + if (!sched_proxy_exec())
> > + return false;
> > + return (struct mutex *)((unsigned long)p->blocked_on & PROXY_BLOCKED_LATCH);
> > +}
> > +
> > +static inline void __set_task_blocked_on_latched(struct task_struct *p)
> > +{
>
> Are you planning to reuse this sometime later in the series? If not I
> think we can convert this to "try_set_task_blocked_on_latch()" and return
> false if it finds blocked on having been cleared already.
>
> That way the lock + check in try_to_block_task() can be moved here.

Yep. I did some further cleanups in my current tree this afternoon,
and I already adapted this from your patch set, and got rid of the
unlocked __ version.


> > +static inline struct mutex *__get_task_latched_blocked_on(struct task_struct *p)
>
> I think this can be __get_task_blocked_on() ...
>
> > +{
> > + if (!task_is_blocked_on(p))
> > + return NULL;
> > + if (p->blocked_on == PROXY_WAKING)
> > + return PROXY_WAKING;
> > + return PROXY_BLOCKED_ON_MASK(p->blocked_on);
> > +}
> >
> > static inline struct mutex *__get_task_blocked_on(struct task_struct *p)
>
> ... and this can be __get_task_blocked_on_raw() since only one caller in
> kernel/locking/mutex.h really cares about the ~PROXY_BLOCKED_LATCH
> value outside of this file.
>
> Everything in the sched bits can then simply be __get_task_blocked_on()
> and that seems much cleaner.
>
> Thoughts?

I guess, elsewhere there are a few callers who want the
~PROXY_BLOCKED_LATCH case (both in sched.h and mutex.h). Though you
are right it is mostly for debugging/warnings, so I guess I could go
with your naming suggestion.

> > diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> > index da20fb6ea25ae..2f912bf698446 100644
> > --- a/kernel/sched/core.c
> > +++ b/kernel/sched/core.c
> > @@ -6599,8 +6599,13 @@ static bool try_to_block_task(struct rq *rq, struct task_struct *p,
> > * blocked on a mutex, and we want to keep it on the runqueue
> > * to be selectable for proxy-execution.
> > */
> > - if (!should_block)
> > - return false;
> > + if (!should_block) {
> > + guard(raw_spinlock)(&p->blocked_lock);
> > + if (p->blocked_on) {
> > + __set_task_blocked_on_latched(p);
> > + return false;
> > + }
> > + }
>
> In my head, this as:
>
> if (!should_block & try_to_latch_task_blocked_on(p))
> return false;
>
> seems much cleaner. I'll defer to other for comments.

Heh! In my current tree for v3 its:
if (!should_block && set_task_blocked_on_latched(p))
return false;

I'm glad we're thinking the same thing.

Thanks as always for the great feedback! I'll give some time for
others to comment and try to send v3 out later tomorrow and folks can
review it after the weekend.

thanks
-john