Re: [RFC][PATCH 2/2] sched: proxy-exec: Add allow/prevent_migration hooks in the sched classes for proxy_tag_curr
From: John Stultz
Date: Thu Mar 05 2026 - 02:31:56 EST
On Wed, Mar 4, 2026 at 5:18 AM Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>
> On Wed, Mar 04, 2026 at 06:38:10AM +0000, John Stultz wrote:
>
> > diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> > index 55bafb1585eca..174a3177a3a6b 100644
> > --- a/kernel/sched/core.c
> > +++ b/kernel/sched/core.c
> > @@ -6712,11 +6712,19 @@ static inline void proxy_tag_curr(struct rq *rq, struct task_struct *owner)
> > * However, the chosen/donor task *and* the mutex owner form an
> > * atomic pair wrt push/pull.
> > *
> > - * Make sure owner we run is not pushable. Unfortunately we can
> > - * only deal with that by means of a dequeue/enqueue cycle. :-/
> > + * Make sure owner we run is not pushable.
> > */
> > - dequeue_task(rq, owner, DEQUEUE_NOCLOCK | DEQUEUE_SAVE);
> > - enqueue_task(rq, owner, ENQUEUE_NOCLOCK | ENQUEUE_RESTORE);
> > + if (owner->sched_class->prevent_migration)
> > + owner->sched_class->prevent_migration(rq, owner);
> > +}
> > +
> > +static inline void proxy_untag_prev(struct rq *rq, struct task_struct *prev)
> > +{
> > + if (!sched_proxy_exec())
> > + return;
> > +
> > + if (prev->sched_class->allow_migration)
> > + prev->sched_class->allow_migration(rq, prev);
> > }
> >
> > /*
> > @@ -6874,7 +6882,7 @@ static void __sched notrace __schedule(int sched_mode)
> > if (!task_current_donor(rq, next))
> > proxy_tag_curr(rq, next);
> > if (!(!preempt && prev_state) && prev != prev_donor)
> > - proxy_tag_curr(rq, prev);
> > + proxy_untag_prev(rq, prev);
> >
> > /*
> > * The membarrier system call requires each architecture
>
> Yeah, not a fan in this form.
>
> I really don't think we need new class callbacks for this. Esp. not
> named like this, which is quite terrible.

Yeah, apologies, I was cringing a bit at the prevent/allow_migration()
names given they alias the other migration-related functions, but just
wasn't feeling creative enough to come up with something else (we do
want to prevent rq->curr from being migrated when we're proxying).

> Note how migrate_disable() and migrate_enable() use ->set_cpus_allowed()
> and are both very much about preventing and allowing migration.
>
> Also note how set_next_task() / put_prev_task() already very much do
> what you want; except they only work for the donor.

Yeah, I like the set_proxy_task()/put_proxy_task() names *much*
better. Thank you for the suggestion!

> Further note that the only reason this proxy_tag_curr() thing lives
> where it does is because it depends on the value of current. However if
> you do this, you no longer have that constraint and then there is a much
> saner place for all this.
>
>
> So I think I prefer (ab)using the migrate_disable() infrastructure,
> simply because it would avoid having to do an (indirect) class call
> entirely -- but looking at how RT/DL handle this, I think there's bugs
> there.
>
> Specifically, something like pick_next_pushable_task() should never
> return something that has ->migration_disabled set, it should continue
> iterating the list until it finds one that hasn't.
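
To make sure I'm reading the pushable-list point right, here is a
standalone userspace model of it. This is a sketch only, not the
kernel's pick_next_pushable_task(): struct model_task and
model_pick_pushable() are made-up names, and the real code walks a
plist rather than a plain singly linked list.

```c
#include <stddef.h>

/* Hypothetical stand-in for task_struct; only what the sketch needs. */
struct model_task {
	const char *comm;
	int prio;			/* lower value = higher priority */
	int migration_disabled;		/* non-zero: must stay on this CPU */
	struct model_task *next;	/* pushable list, sorted by prio */
};

/*
 * Walk the pushable list in priority order and return the first task
 * that may actually be migrated, or NULL if every entry is pinned.
 * Returning the head unconditionally is the behavior being questioned.
 */
static struct model_task *model_pick_pushable(struct model_task *head)
{
	struct model_task *p;

	for (p = head; p; p = p->next) {
		if (p->migration_disabled)
			continue;	/* keep iterating past pinned tasks */
		return p;
	}
	return NULL;
}
```

With a list a -> b -> c where only a has migration_disabled set, the
model returns b instead of giving up or pushing a.
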
>
>
> Anyway, without having tested anything at all, how crazy would something
> like this be?

It needed some rework to handle the pick_again looping properly, as you
pointed out on IRC. The migration_disabled/migration_flags state also
needs to be handled on fork, or else we end up with tasks that are stuck
non-migratable w/ MDF_PROXY set that never gets dropped.

After working around those, I'm still sometimes hitting warnings and
issues around __set_cpus_allowed_ptr_locked(), which sounds like what K
Prateek mentioned. I'll have to dig more into it tomorrow.
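
For the fork issue mentioned above, a rough userspace model of the
problem and one possible way to handle it. This is an illustration
under assumptions, not the actual workaround: struct fork_model_task
and model_fork() are made-up names, the MODEL_MDF_PROXY bit value is
invented, and it assumes the child should always start out migratable.

```c
#define MODEL_MDF_PROXY 0x08	/* made-up bit value, for illustration only */

/* Hypothetical stand-in for the two task_struct fields involved. */
struct fork_model_task {
	unsigned short migration_disabled;	/* disable nesting count */
	unsigned short migration_flags;		/* MDF_* style flag bits */
};

/*
 * Model of fork: the child starts as a byte copy of the parent, so if
 * the parent was tagged as a proxy owner at that moment the child
 * inherits the tag. Nothing will ever untag the child, so clear the
 * inherited state explicitly.
 */
static void model_fork(struct fork_model_task *child,
		       const struct fork_model_task *parent)
{
	*child = *parent;			/* inherits parent state */
	child->migration_disabled = 0;		/* assumption: child starts migratable */
	child->migration_flags &= ~MODEL_MDF_PROXY;
}
```

Without the two resets, a child forked from a tagged owner would stay
pinned, since only the parent ever goes through the untag path.
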
thanks
-john