Re: [PATCH] sched/proxy_exec: Optimize proxy_tag_curr() pushable removal

From: K Prateek Nayak

Date: Tue Mar 03 2026 - 10:45:35 EST


Hello Zhidao,

On 3/3/2026 5:27 PM, soolaugust@xxxxxxxxx wrote:
> The SAVE/RESTORE cycle achieves pushable removal only indirectly:
> enqueue_task_rt/dl() suppresses re-enqueue into the pushable
> list when task_is_blocked(owner) is true. The same result is
> obtained more directly by calling dequeue_pushable_task() or
> dequeue_pushable_dl_task() once, without any of the side effects.
>
> Replace the workaround with per-class direct calls:

I had the same question in the past. From my reading, I think we
can safely call the dequeue_pushable*() helpers (John might have some
notes on these bits), but open-coding those direct calls is just horrible!

Why can't we just have a sched_class->proxy_tag_curr(rq, p) or such
and just call it if a "sched_class" populates it?

>
> RT: dequeue_pushable_task(rq, owner) -- O(1) plist remove
> DL: dequeue_pushable_dl_task(rq, owner) -- O(log n) rb_erase,
> but avoids the bandwidth counter churn entirely
> CFS: no-op (no pushable list; task_is_blocked() suffices)
>
> Both functions are promoted from static and declared in sched.h.
> deadline.c also gains the missing isolation.h include required
> by dl_get_task_effective_cpus().
>
> Signed-off-by: zhidao su <suzhidao@xxxxxxxxxx>
> ---
> kernel/sched/core.c | 28 +++++++++++++++++++---------
> kernel/sched/deadline.c | 3 ++-
> kernel/sched/rt.c | 2 +-
> kernel/sched/sched.h | 2 ++
> 4 files changed, 24 insertions(+), 11 deletions(-)
>
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index dc9f17b35e4..2aba15d84b7 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -6728,16 +6728,26 @@ static inline void proxy_tag_curr(struct rq *rq, struct task_struct *owner)
> if (!sched_proxy_exec())
> return;
> /*
> - * pick_next_task() calls set_next_task() on the chosen task
> - * at some point, which ensures it is not push/pullable.
> - * However, the chosen/donor task *and* the mutex owner form an
> - * atomic pair wrt push/pull.
> + * The donor goes through set_next_task() which calls
> + * dequeue_pushable_task() making it non-pushable. The owner
> + * does not go through that path, so we must remove it from
> + * the pushable list explicitly.
> *
> - * Make sure owner we run is not pushable. Unfortunately we can
> - * only deal with that by means of a dequeue/enqueue cycle. :-/
> - */
> - dequeue_task(rq, owner, DEQUEUE_NOCLOCK | DEQUEUE_SAVE);
> - enqueue_task(rq, owner, ENQUEUE_NOCLOCK | ENQUEUE_RESTORE);
> + * For RT tasks: remove from the plist directly.
> + * For DL tasks: remove from the rb-tree directly.
> + * For CFS tasks: no pushable list exists; can_migrate_task()
> + * already rejects blocked owners via task_is_blocked().

The tag is for the currently running context (owner). The task_on_cpu()
check is what guards that for fair.

> + *
> + * The prior dequeue/enqueue(SAVE/RESTORE) cycle achieved the
> + * same result by relying on task_is_blocked() suppressing the
> + * re-enqueue into the pushable list, but it carried O(log n)
> + * overhead and, for DL owners, triggered sub_running_bw() +
> + * sub_rq_bw() -- bandwidth counter churn with no net effect.
> + */
> + if (rt_task(owner))
> + dequeue_pushable_task(rq, owner);
> + else if (dl_task(owner))
> + dequeue_pushable_dl_task(rq, owner);

I think a sched_class callback would be better, and it would also allow
other sched classes to just populate it when the need arises instead of
adding more else-if branches here. Like:

	if (owner->sched_class->proxy_tag_curr) /* or dequeue_pushable_task()? */
		owner->sched_class->proxy_tag_curr(rq, owner);
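
For illustration, a minimal user-space sketch of that optional-callback
shape. Everything here is a stand-in invented for the example: the struct
layouts, the two mock classes and the nr_pushable counter are not the
kernel's definitions, and struct sched_class has no proxy_tag_curr member
today.

```c
#include <stddef.h>

/* Mock stand-ins for the kernel types; only what the sketch needs. */
struct rq {
	int nr_pushable;		/* stand-in for the pushable plist/rb-tree */
};

struct task_struct;

struct sched_class {
	/* Hypothetical optional hook; NULL when the class has nothing to do. */
	void (*proxy_tag_curr)(struct rq *rq, struct task_struct *p);
};

struct task_struct {
	const struct sched_class *sched_class;
	int on_pushable_list;
};

/* Mock of an RT-style hook: drop p from the pushable list if it is on it. */
static void rt_proxy_tag_curr(struct rq *rq, struct task_struct *p)
{
	if (p->on_pushable_list) {
		p->on_pushable_list = 0;
		rq->nr_pushable--;
	}
}

static const struct sched_class rt_class_mock = {
	.proxy_tag_curr = rt_proxy_tag_curr,
};

/* A class without a pushable list simply leaves the hook unset. */
static const struct sched_class fair_class_mock = {
	.proxy_tag_curr = NULL,
};

/* Core-side dispatch: no per-class branches, the class decides. */
static void proxy_tag_curr(struct rq *rq, struct task_struct *owner)
{
	if (owner->sched_class->proxy_tag_curr)
		owner->sched_class->proxy_tag_curr(rq, owner);
}
```

With this shape the helpers can stay static inside their own files, and a
class that later grows a pushable list only has to fill in the pointer.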

Also, I forget how the current context gets put back on the pushable list
once the proxy is done, since at that point it is just a preempted task
on the CPU.

John, do you remember how this happens?

I think we might also need a proxy_untag_current() that calls into
enqueue_pushable.*_task(), doing just the minimal bits of put_prev_task()
needed to undo the tag, and possibly queuing the balance callbacks that
set_next_task() would have skipped on not seeing any pushable tasks
at the time of donor pick?
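
A rough user-space sketch of what such a tag/untag pair could look like.
This is only a model of the idea: the types, the on_pushable flag and
push_work_queued are made up for the example, the nr_cpus_allowed check is
an assumption borrowed from how put_prev_task_rt() gates its enqueue, and
none of this is existing kernel code.

```c
#include <stdbool.h>

/* Mock types: only the fields this sketch touches, not the kernel's layout. */
struct task_struct {
	bool on_pushable;	/* stand-in for membership in the pushable list */
	int nr_cpus_allowed;
};

struct rq {
	int nr_pushable;
	bool push_work_queued;	/* stand-in for a queued balance callback */
};

/* Tag: the owner runs on behalf of the donor, so hide it from push/pull. */
static void proxy_tag_curr(struct rq *rq, struct task_struct *owner)
{
	if (!owner->on_pushable)
		return;

	owner->on_pushable = false;
	rq->nr_pushable--;
}

/*
 * Hypothetical inverse: once the proxy is done the owner is an ordinary
 * preempted task again, so make it pushable and request the push work
 * that was skipped while the list looked empty.
 */
static void proxy_untag_current(struct rq *rq, struct task_struct *owner)
{
	/* A task pinned to one CPU can never be pushed anywhere. */
	if (owner->on_pushable || owner->nr_cpus_allowed <= 1)
		return;

	owner->on_pushable = true;
	rq->nr_pushable++;
	rq->push_work_queued = true;
}
```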

> }
>
> /*

--
Thanks and Regards,
Prateek