Re: [PATCH 1/6] sched/proxy: Remove superfluous clear_task_blocked_in()

From: K Prateek Nayak

Date: Fri May 29 2026 - 02:45:58 EST


Hello John,

On 5/29/2026 4:50 AM, John Stultz wrote:
> However, even with the fix I poined out, I've unfortunately hit races
> with the ww_mutex selftest at the point of this patch in the series.
> Basically between commit
> 1b89b7b21bf5 ("sched/proxy: Remove superfluous clear_task_blocked_in()")
> and
> a8be1edac5a1 ("sched/proxy: Remove PROXY_WAKING")
>
> I'm currently tracing down exactly why the race is cropping up but I
> believe the chunk removed in this case is avoiding cases where we end
> up getting PROXY_WAKING set on a TASK_RUNNING task.

This seems to be the failure path:

/* Task p*/
mutex_lock(mutex)
... try_to_wake_up(p)
schedule_preempt_disabled() ttwu_runnable()
__schedule() __task_rq_lock() /* Wins */
rq_lock() /* Waits */ if (task_on_rq_queued(p))
/*
* p->is_blocked is still not set!
* proxy_needs_return() bails out early.
*/
ttwu_do_wakeup()
p->__state = TASK_RUNNING;
__tsk_rq_unlock();
...
/* p->__state = TASK_RUNNING */
prev_state = p->__state;
if (prev_state && ...) {
/*
* Skipped since task is
* already TASK_RUNNING
*/
}

/* p->is_blocked = 0; p->blocked_on = PROXY_WAKING */
next = p;

/* Returns from schedule_preempt_disabled()
set_task_blocked_on(p, mutex)

!!! p->blocked_on == PROXY_WAKING && p->blocked_on != mutex !!!
---

Also proxy_needs_return() bails out too early - a wakeup from signal
should still clear p->blocked_on even if p->wake_cpu is same as
task_cpu().


I think we need the following at commit 83f9b04ef50c ("sched/proxy:
Switch proxy to use p->is_blocked") to clear p->blocked_on in the wakeup
path irrespective of p->is_blocked:

(Lightly tested)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index a125e65c35bb..fe903976fd09 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -3764,28 +3764,28 @@ static inline void proxy_reset_donor(struct rq *rq)
*/
static inline bool proxy_needs_return(struct rq *rq, struct task_struct *p)
{
- if (!p->is_blocked)
- return false;
-
- /*
- * Typically per __set_task_cpu(), task_cpu(p) == p->wake_cpu.
- *
- * However, proxy_set_task_cpu() is such that it preserves the
- * original cpu in p->wake_cpu while migrating p for proxy reasons
- * (possibly outside of the allowed p->cpus_ptr).
- *
- * Furthermore, migration_cpu_stop() / __migrate_swap_task(), will
- * only set p->wake_cpu when !p->on_rq, and since here p->on_rq, this
- * will not apply. But if it did, this check is the safe way around
- * and would migrate.
- */
- if (task_cpu(p) == p->wake_cpu)
+ if (!task_is_blocked(p))
return false;

scoped_guard(raw_spinlock, &p->blocked_lock) {
/* Task is waking up; clear any blocked_on relationship */
__clear_task_blocked_on(p, NULL);

+ /*
+ * Typically per __set_task_cpu(), task_cpu(p) == p->wake_cpu.
+ *
+ * However, proxy_set_task_cpu() is such that it preserves the
+ * original cpu in p->wake_cpu while migrating p for proxy reasons
+ * (possibly outside of the allowed p->cpus_ptr).
+ *
+ * Furthermore, migration_cpu_stop() / __migrate_swap_task(), will
+ * only set p->wake_cpu when !p->on_rq, and since here p->on_rq, this
+ * will not apply. But if it did, this check is the safe way around
+ * and would migrate.
+ */
+ if (task_cpu(p) == p->wake_cpu)
+ return false;
+
/* If already current, don't need to return migrate */
if (task_current(rq, p))
return false;
---

Part of that belongs in commit e2ff8b7bde07 ("sched/proxy: Only return
migrate when needed") and the first hunk of 83f9b04ef50c ("sched/proxy:
Switch proxy to use p->is_blocked") in proxy_needs_return() should be
dropped.

--
Thanks and Regards,
Prateek