[PATCH] sched/proxy_exec: Handle sched_delayed owner in find_proxy_task()

From: soolaugust

Date: Mon Mar 02 2026 - 05:18:44 EST

From: zhidao su <suzhidao@xxxxxxxxxx>

The blocked-owner check at the top of the inner loop unconditionally
lumps two distinct states into one:

  1. !on_rq        -- the owner has fully left the runqueue; PE cannot
                      proceed and proxy_deactivate() is the right action.
  2. sched_delayed -- EEVDF deferred-dequeue: the owner called schedule()
                      but was kept physically in the RB-tree because it
                      was not yet eligible (entity_eligible() == false,
                      i.e. its lag was still negative).

Case 2 is transient. The owner will resolve to one of two outcomes:

  * A wakeup arrives  --> sched_delayed cleared, on_rq stays 1,
                          owner eligible for PE on the next cycle.
  * Dequeue completes --> on_rq drops to 0, caught by case 1 above.
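
For illustration only, the two states and their two outcomes can be
modelled in a few lines of userspace C. This is a sketch, not kernel
code: struct model_task, enum owner_state and the model_*() helpers are
hypothetical names, and the struct keeps only the two flags discussed
above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; only the two flags discussed above. */
struct model_task {
	bool on_rq;		/* stands in for owner->on_rq */
	bool sched_delayed;	/* stands in for owner->se.sched_delayed */
};

enum owner_state {
	OWNER_OFF_RQ,		/* case 1: fully dequeued */
	OWNER_DELAYED,		/* case 2: deferred dequeue pending */
	OWNER_RUNNABLE,		/* neither case applies */
};

static enum owner_state model_owner_state(const struct model_task *t)
{
	/* Modelled invariant: a delayed task is still on the runqueue. */
	assert(!t->sched_delayed || t->on_rq);

	if (!t->on_rq)
		return OWNER_OFF_RQ;
	if (t->sched_delayed)
		return OWNER_DELAYED;
	return OWNER_RUNNABLE;
}

/* Outcome A: a wakeup arrives; sched_delayed is cleared, on_rq stays set. */
static void model_wakeup(struct model_task *t)
{
	t->sched_delayed = false;
}

/* Outcome B: the deferred dequeue completes; the task leaves the runqueue. */
static void model_dequeue_complete(struct model_task *t)
{
	t->sched_delayed = false;
	t->on_rq = false;
}
```

Starting from a task with both flags set, model_wakeup() moves it to
OWNER_RUNNABLE and model_dequeue_complete() moves it to OWNER_OFF_RQ;
the model has no other exit from OWNER_DELAYED.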

Calling proxy_deactivate() in case 2 is unnecessarily aggressive: it
removes the high-priority donor from the runqueue and clears its
blocked_on, discarding valid PE state for a single missed cycle.

A task that enters the mutex slowpath sets blocked_on before calling
schedule(), and try_to_block_task() is only reached via the explicit
DEQUEUE_DELAYED path -- not the sched_delayed shortcut. Therefore a
sched_delayed owner never has blocked_on set and the chain cannot be
followed further regardless.

Split the check: keep proxy_deactivate() for !on_rq, and switch to
proxy_resched_idle() for sched_delayed. This mirrors the existing
handling of task_on_rq_migrating() owners (see proxy_resched_idle()
call below), which also uses a yield-to-idle to handle a transient
per-owner condition without disturbing the donor.
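
The effect of the split can be sketched as a pure decision function,
again as a userspace model rather than kernel code. struct owner_flags,
enum pe_action and both *_check() helpers are hypothetical names; the
three actions stand for calling proxy_deactivate(), calling
proxy_resched_idle(), and falling through to the rest of the loop.

```c
#include <stdbool.h>

/* Hypothetical model of the two owner flags the check looks at. */
struct owner_flags {
	bool on_rq;
	bool sched_delayed;
};

enum pe_action {
	PE_DEACTIVATE_DONOR,	/* models proxy_deactivate(rq, donor) */
	PE_RESCHED_IDLE,	/* models proxy_resched_idle(rq) */
	PE_CONTINUE,		/* fall through to the later checks */
};

/* Combined check: both states deactivate the donor. */
static enum pe_action old_check(struct owner_flags o)
{
	if (!o.on_rq || o.sched_delayed)
		return PE_DEACTIVATE_DONOR;
	return PE_CONTINUE;
}

/* Split check: only a fully dequeued owner deactivates the donor. */
static enum pe_action new_check(struct owner_flags o)
{
	if (!o.on_rq)
		return PE_DEACTIVATE_DONOR;
	if (o.sched_delayed)
		return PE_RESCHED_IDLE;
	return PE_CONTINUE;
}
```

In this model the two functions disagree only for an owner with on_rq
set and sched_delayed set, which is the one case the change targets.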

Signed-off-by: zhidao su <suzhidao@xxxxxxxxxx>
---
kernel/sched/core.c | 25 +++++++++++++++++++++++--
1 file changed, 23 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index b7f77c165a6..dc9f17b35e4 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -6625,10 +6625,31 @@ find_proxy_task(struct rq *rq, struct task_struct *donor, struct rq_flags *rf)
 			return p;
 		}
 
-		if (!READ_ONCE(owner->on_rq) || owner->se.sched_delayed) {
-			/* XXX Don't handle blocked owners/delayed dequeue yet */
+		if (!READ_ONCE(owner->on_rq)) {
+			/*
+			 * Owner is off the runqueue; proxy execution cannot
+			 * proceed through it. Deactivate the donor so it will
+			 * be properly re-enqueued when the owner eventually
+			 * wakes and releases the mutex.
+			 */
 			return proxy_deactivate(rq, donor);
 		}
+		if (owner->se.sched_delayed) {
+			/*
+			 * The owner is in EEVDF's deferred-dequeue state: it
+			 * called schedule() but the scheduler kept it physically
+			 * on the runqueue because it was not yet eligible.
+			 * This is a transient condition -- the owner will either
+			 * be woken (clearing sched_delayed) or fully dequeued
+			 * (clearing on_rq) very shortly.
+			 *
+			 * Unlike the !on_rq case the donor is still valid; do
+			 * not deactivate it. Yield to idle so the owner can
+			 * complete its state transition, then retry PE on the
+			 * next scheduling cycle.
+			 */
+			return proxy_resched_idle(rq);
+		}
 
 		if (task_cpu(owner) != this_cpu) {
 			/* XXX Don't handle migrations yet */
--
2.43.0