[PATCH v29 0/9] Optimized Donor Migration for Proxy Execution

From: John Stultz

Date: Mon May 11 2026 - 22:56:54 EST


Hey All,

This is just the next step for Proxy Execution, allowing
proxy-return-migration to occur from the try_to_wake_up()
callpath. This, however, introduces some complexity, as it means
the rq->donor can change during a try_to_wake_up() call, and thus
further changes are needed to make this safe.

As always, I’m trying to submit this larger work in smallish
digestible pieces. In this portion of the series, I’m only
submitting for review and consideration the patches which
minimize passing “prev” (ie: the rq->donor) down through the
class schedulers, since this cached value can become invalid
across runqueue lock drops. From there, there are some fixups to
the deadline class scheduler to avoid issues when the donor
changes, then the actual patch to do return migration from
ttwu(), followed by the added is_blocked state to avoid the
workqueue stall issue, and finally some optimization patches so
we migrate the entire task chain in proxy_migrate_task().
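
To illustrate why a cached “prev” is fragile across lock drops,
here is a minimal userspace sketch. It is not code from the
series: toy_rq, toy_task, toy_balance() and the pick_*() helpers
are all hypothetical stand-ins, and the donor change that a
concurrent wakeup would cause is simply modeled inline.

```c
#include <assert.h>

/* Toy stand-ins for the kernel structures (hypothetical, not the real ones). */
struct toy_task { int id; };

struct toy_rq {
	int locked;              /* models whether the rq lock is held */
	struct toy_task *donor;  /* may change whenever the lock is dropped */
};

struct toy_task task_a = { .id = 1 };
struct toy_task task_b = { .id = 2 };

void toy_rq_lock(struct toy_rq *rq)   { assert(!rq->locked); rq->locked = 1; }
void toy_rq_unlock(struct toy_rq *rq) { assert(rq->locked);  rq->locked = 0; }

/*
 * Models a balance step that drops the rq lock. While the lock is
 * dropped, the donor is switched, standing in for a concurrent wakeup.
 */
void toy_balance(struct toy_rq *rq)
{
	toy_rq_unlock(rq);
	rq->donor = &task_b;
	toy_rq_lock(rq);
}

/* Fragile: the caller cached "prev" before the lock drop and passed it down. */
int pick_with_cached_prev(struct toy_rq *rq, struct toy_task *prev)
{
	toy_balance(rq);
	return prev->id;       /* stale: no longer matches rq->donor */
}

/* More robust: re-read rq->donor after anything that may drop the lock. */
int pick_rereading_donor(struct toy_rq *rq)
{
	toy_balance(rq);
	return rq->donor->id;
}

int demo_cached_prev(void)
{
	struct toy_rq rq = { .locked = 0, .donor = &task_a };
	int id;

	toy_rq_lock(&rq);
	id = pick_with_cached_prev(&rq, rq.donor);
	toy_rq_unlock(&rq);
	return id;
}

int demo_reread_donor(void)
{
	struct toy_rq rq = { .locked = 0, .donor = &task_a };
	int id;

	toy_rq_lock(&rq);
	id = pick_rereading_donor(&rq);
	toy_rq_unlock(&rq);
	return id;
}
```

In this toy model demo_cached_prev() still reports the old donor,
while demo_reread_donor() sees the new one. In the kernel the
stale pointer could be worse than merely outdated.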

Apologies for being sort of slow on recent replies. I was hoping
to get this out last week, but I caught whatever is going around
and was out for a bit (but the series did get a few extra days
of test runtime, without issue, so that’s good).

Big thanks to K Prateek and PeterZ for their feedback and
suggestions on the new changes.

New in this iteration:
* Vineeth Pillai diagnosed and reported subtle workqueue stalls
  being caused by __schedule(SM_PREEMPT) calls racing with
  __schedule(SM_NONE) calls made after the task’s blocked_on is
  set, causing tasks to be deactivated without running task
  workqueue work. Peter suggested that we use an is_blocked flag
  to better track when a task should have been blocked (even if
  we don’t block and keep it on the rq for proxy-exec). This
  solution requires the ttwu() changes in this series, so it was
  included in this set.

* Use scoped_guard() in proxy_needs_return() to avoid calling
  block_task() while holding blocked_lock (a Sashiko suggestion)

* Minor cleanup in deadline changes
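
For anyone unfamiliar with the scoped-guard pattern mentioned
above, here is a minimal userspace sketch of the idea: take the
lock only for a narrow scope, record a decision, and call the
blocking helper after the scope has released the lock. This is
not the proxy_needs_return() code from the series. It relies on
the GCC/Clang cleanup attribute, and toy_lock, toy_scoped_guard(),
toy_block_task() and toy_needs_return() are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_lock { bool held; };

struct toy_lock blocked_lock;
int block_calls;
bool block_called_with_lock_held;

struct toy_lock *toy_lock_acquire(struct toy_lock *l)
{
	assert(!l->held);
	l->held = true;
	return l;
}

/* Cleanup handler: receives a pointer to the guard variable itself. */
void toy_guard_exit(struct toy_lock **l)
{
	assert((*l)->held);
	(*l)->held = false;
}

/*
 * Runs the following block exactly once with the lock held. The lock
 * is released when the guard variable goes out of scope.
 */
#define toy_scoped_guard(lockp)                                          \
	for (struct toy_lock *_guard                                     \
		     __attribute__((cleanup(toy_guard_exit))) =          \
		     toy_lock_acquire(lockp), *_once = _guard;           \
	     _once; _once = NULL)

/* Stand-in for a helper that must not be called with the lock held. */
void toy_block_task(void)
{
	if (blocked_lock.held)
		block_called_with_lock_held = true;
	block_calls++;
}

bool toy_needs_return(bool task_is_blocked)
{
	bool need_block = false;

	toy_scoped_guard(&blocked_lock) {
		/* Inspect state under the lock, but only record the decision. */
		need_block = task_is_blocked;
	}

	/* The guard has released the lock, so the helper is safe to call. */
	if (need_block)
		toy_block_task();

	return need_block;
}
```

Calling toy_needs_return(true) invokes toy_block_task() once, with
the lock already released, and leaves blocked_lock unlocked on
return.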

I’d love to get further feedback on any place where these patches
are confusing, or could use additional clarifications.

Additionally I’d appreciate any testing or comments that folks
have with the full Proxy Execution series! You can find the full
Proxy Exec series here:
https://github.com/johnstultz-work/linux-dev/commits/proxy-exec-v29-7.1-rc3/
https://github.com/johnstultz-work/linux-dev.git proxy-exec-v29-7.1-rc3

New changes to the full patch series in this revision include:
* Pulled in Vineeth’s waitqueue stall reproducer

Issues still to address with the full series:
* Really excited to see Andrea’s series taking a stab at better
  getting sched_ext and Proxy Execution to play nice with each
  other. I hope to try to integrate his series into my full
  stack for testing:
  https://lore.kernel.org/lkml/20260506174639.535232-1-arighi@xxxxxxxxxx/

* Reevaluate the performance regression K Prateek Nayak found
  with the full series.

* The chain migration functionality needs further iterations and
  better validation to ensure it truly maintains the RT/DL load
  balancing invariants (despite this being broken in vanilla
  upstream with RT_PUSH_IPI currently)

Future work:
* Expand to more locking primitives: Figuring out pi-futexes
  would be good; using proxy for Binder PI is something else
  we’re exploring.
* Eventually: Work to replace rt_mutexes and get things happy
  with PREEMPT_RT


Credit/Disclaimer:
----------------------
As always, this Proxy Execution series has a long history with
lots of developers that deserve credit:

First described in a paper[1] by Watkins, Straub, and Niehaus,
then in patches from Peter Zijlstra, extended with lots of work
by Juri Lelli, Valentin Schneider, and Connor O'Brien (and thank
you to Steven Rostedt for providing additional details here!).
Thanks also to Joel Fernandes, Dietmar Eggemann, Metin Kaya,
K Prateek Nayak and Suleiman Souhlal for their substantial
review, suggestion, and patch contributions.

So again, many thanks to those above, as all the credit for this
series really is due to them - while the mistakes are surely mine.

Thanks so much!
-john

[1] https://static.lwn.net/images/conf/rtlws11/papers/proc/p38.pdf

Cc: Joel Fernandes <joelagnelf@xxxxxxxxxx>
Cc: Qais Yousef <qyousef@xxxxxxxxxxx>
Cc: Ingo Molnar <mingo@xxxxxxxxxx>
Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Cc: Juri Lelli <juri.lelli@xxxxxxxxxx>
Cc: Vincent Guittot <vincent.guittot@xxxxxxxxxx>
Cc: Dietmar Eggemann <dietmar.eggemann@xxxxxxx>
Cc: Valentin Schneider <vschneid@xxxxxxxxxx>
Cc: Steven Rostedt <rostedt@xxxxxxxxxxx>
Cc: Ben Segall <bsegall@xxxxxxxxxx>
Cc: Zimuzo Ezeozue <zezeozue@xxxxxxxxxx>
Cc: Will Deacon <will@xxxxxxxxxx>
Cc: Waiman Long <longman@xxxxxxxxxx>
Cc: Boqun Feng <boqun.feng@xxxxxxxxx>
Cc: "Paul E. McKenney" <paulmck@xxxxxxxxxx>
Cc: Metin Kaya <Metin.Kaya@xxxxxxx>
Cc: Xuewen Yan <xuewen.yan94@xxxxxxxxx>
Cc: K Prateek Nayak <kprateek.nayak@xxxxxxx>
Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
Cc: Daniel Lezcano <daniel.lezcano@xxxxxxxxxx>
Cc: Suleiman Souhlal <suleiman@xxxxxxxxxx>
Cc: kuyo chang <kuyo.chang@xxxxxxxxxxxx>
Cc: hupu <hupu.gm@xxxxxxxxx>
Cc: kernel-team@xxxxxxxxxxx

John Stultz (8):
  sched: Rework pick_next_task() and prev_balance() to avoid stale prev
    references
  sched: deadline: Add some helper variables to cleanup deadline logic
  sched: deadline: Add dl_rq->curr pointer to address issues with Proxy
    Exec
  sched: Rework block_task so it can be directly called
  sched: Have try_to_wake_up() handle return-migration for PROXY_WAKING
    case
  sched: Add is_blocked task flag
  sched: Break out core of attach_tasks() helper into sched.h
  sched: Migrate whole chain in proxy_migrate_task()

Peter Zijlstra (1):
  sched: Add blocked_donor link to task for smarter mutex handoffs

 include/linux/sched.h    |  10 +-
 init/init_task.c         |   1 +
 kernel/fork.c            |   1 +
 kernel/locking/mutex.c   |  43 +++++-
 kernel/sched/core.c      | 322 +++++++++++++++++++++------------------
 kernel/sched/deadline.c  |  46 ++++--
 kernel/sched/fair.c      |  25 +--
 kernel/sched/idle.c      |   2 +-
 kernel/sched/rt.c        |   8 +-
 kernel/sched/sched.h     |  30 +++-
 kernel/sched/stop_task.c |   2 +-
 11 files changed, 300 insertions(+), 190 deletions(-)

--
2.54.0.563.g4f69b47b94-goog