Re: [PATCH v2 2/2] sched/core: Don't proxy-exec unmatched cookie lock owners
From: John Stultz
Date: Tue May 12 2026 - 18:17:09 EST
On Thu, May 7, 2026 at 3:42 AM Vasily Gorbik <gor@xxxxxxxxxxxxx> wrote:
>
> Core scheduling chooses a core-wide cookie before __schedule()
> installs the next task. With proxy-exec enabled, that task becomes the
> donor/scheduling context, and find_proxy_task() may then replace the
> execution context with the runnable mutex owner. If its cookie differs
> from the selected core cookie, running it would bypass core scheduling's
> cookie selection.
>
> When the final mutex owner found by find_proxy_task() does not match the
> selected core cookie, stop proxying the donor. If the current execution
> context is already in the blocked chain, fall back to idle like the
> existing proxy-exec retry paths do. Otherwise deactivate the donor and
> let __schedule() pick again. The mutex owner can be picked later under
> its own cookie.
>
> Fixes: 7de9d4f94638 ("sched: Start blocked_on chain processing in find_proxy_task()")
> Reported-by: K Prateek Nayak <kprateek.nayak@xxxxxxx>
> Signed-off-by: Vasily Gorbik <gor@xxxxxxxxxxxxx>
> ---
> kernel/sched/core.c | 6 ++++++
> 1 file changed, 6 insertions(+)
>
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 8aed55592ca9..d338fb714ce8 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -6960,6 +6960,12 @@ find_proxy_task(struct rq *rq, struct task_struct *donor, struct rq_flags *rf)
> */
> }
> WARN_ON_ONCE(owner && !owner->on_rq);
> +
> + if (owner && !sched_cpu_cookie_match(rq, owner)) {
> + if (curr_in_chain)
> + return proxy_resched_idle(rq);
> + goto deactivate;
> + }
Hrm. This is less pretty.
My previous (admittedly shallow) thinking on the core-scheduler was
that it wouldn't be an issue for proxy because the donor wasn't going
to actually run on the cpu, so whatever isolation is done on the core,
the donor migration wouldn't be a problem.
But I'm seeing now the donor won't be *chosen* until it has the right
core_cookie, and then that may be different from the owner's cookie.
It seems like ideally we want the donor's effective cookie to be the
same as the runnable-owner's in the chain. The downside to this is
you have to walk the blocked_on chain to evaluate this, and the whole
core_tree rbtree sorts by cookie, so it's not trivial to rework
selection this way. And since the runnable-owner of the chain-tree
changes over time, we can't just set the inherited cookie when we set
blocked_on.
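To make the chain-walk idea concrete, here is a rough userspace toy, not
kernel code. struct toy_task, TOY_MAX_CHAIN and both helpers are
hypothetical names made up for illustration; the real blocked_on and
core_cookie handling involves locking and rq state that this ignores.
It only shows that the "effective" cookie has to be derived by walking
to the end of the chain each time it is evaluated:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: every name here is hypothetical, not a kernel API. */
struct toy_task {
	unsigned long cookie;              /* stands in for core_cookie */
	bool runnable;                     /* stands in for "on_rq and not blocked" */
	struct toy_task *blocked_on_owner; /* owner of the mutex we wait on, or NULL */
};

#define TOY_MAX_CHAIN 64 /* arbitrary bound so a cyclic chain cannot loop forever */

/* Follow the chain from the donor; return the final owner if it is runnable. */
static struct toy_task *toy_chain_owner(struct toy_task *donor)
{
	struct toy_task *p = donor;
	int depth;

	assert(donor != NULL);
	for (depth = 0; depth < TOY_MAX_CHAIN; depth++) {
		if (!p->blocked_on_owner)
			return p->runnable ? p : NULL;
		p = p->blocked_on_owner;
	}
	return NULL; /* chain too long or cyclic: give up */
}

/*
 * Cookie the donor would be selected under in this toy: the runnable
 * owner's cookie if there is one, otherwise the donor's own cookie.
 */
static unsigned long toy_effective_cookie(struct toy_task *donor)
{
	struct toy_task *owner = toy_chain_owner(donor);

	return owner ? owner->cookie : donor->cookie;
}
```

Because the result depends on whichever task currently sits at the end
of the chain, it cannot be cached at the time blocked_on is set, which
is the part that clashes with a tree keyed by cookie.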
So I will need to think a bit more on this.
In the short term, I think your change is probably ok since it makes
sure we don't run tasks with the wrong cookie, but it effectively
stops proxying from having a beneficial effect.
Thanks again so much for raising this issue (along with K Prateek)!
thanks
-john