Re: [PATCH v3 07/14] sched: Introduce restrict_cpus_allowed_ptr() to limit task CPU affinity

From: Will Deacon
Date: Thu Nov 19 2020 - 11:42:03 EST


On Thu, Nov 19, 2020 at 02:54:32PM +0000, Valentin Schneider wrote:
>
> On 19/11/20 13:13, Will Deacon wrote:
> > On Thu, Nov 19, 2020 at 11:27:55AM +0000, Valentin Schneider wrote:
> >>
> >> On 19/11/20 11:05, Will Deacon wrote:
> >> > On Thu, Nov 19, 2020 at 09:18:20AM +0000, Quentin Perret wrote:
> >> >> > @@ -1937,20 +1931,69 @@ static int __set_cpus_allowed_ptr(struct task_struct *p,
> >> >> > * OK, since we're going to drop the lock immediately
> >> >> > * afterwards anyway.
> >> >> > */
> >> >> > - rq = move_queued_task(rq, &rf, p, dest_cpu);
> >> >> > + rq = move_queued_task(rq, rf, p, dest_cpu);
> >> >> > }
> >> >> > out:
> >> >> > - task_rq_unlock(rq, p, &rf);
> >> >> > + task_rq_unlock(rq, p, rf);
> >> >>
> >> >> And that's a little odd to have here no? Can we move it back on the
> >> >> caller's side?
> >> >
> >> > I don't think so, unfortunately. __set_cpus_allowed_ptr_locked() can trigger
> >> > migration, so it can drop the rq lock as part of that and end up relocking a
> >> > new rq, which it also unlocks before returning. Doing the unlock in the
> >> > caller is therefore even weirder, because you'd have to return the lock
> >> > pointer or something horrible like that.
> >> >
> >> > I did add a comment about this right before the function and it's an
> >> > internal function to the scheduler so I think it's ok.
> >> >
> >>
> >> An alternative here would be to add a new SCA_RESTRICT flag for
> >> __set_cpus_allowed_ptr() (see migrate_disable() faff in
> >> tip/sched/core). Not fond of either approach, but the flag thing would
> >> avoid this "quirk".
> >
> > I tried this when I read about the migrate_disable() stuff on lwn, but I
> > didn't really find it any better to work with tbh. It also doesn't help
> > with the locking that Quentin was mentioning, does it? (i.e. you still
> > have to allocate).
> >
>
> You could keep it all bundled within __set_cpus_allowed_ptr() (i.e. not
> have a _locked() version) and use the flag as indicator of any extra work.

Ah, gotcha. Still not convinced it's any better, but I see that it works.
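
To make the shape of that suggestion concrete, here is a rough userspace
sketch: one entry point that takes and drops the lock itself, with a flag
selecting the extra restriction step. Everything in it (toy_task,
toy_set_cpus_allowed(), TOY_SCA_RESTRICT, the 64-bit mask, and saving the old
mask as the "extra work") is a hypothetical stand-in chosen for illustration,
not the scheduler code from the patch or from tip/sched/core.

```c
#include <pthread.h>
#include <stdint.h>

#define TOY_SCA_RESTRICT 0x1u	/* hypothetical flag, modelled on the SCA_RESTRICT idea */

struct toy_task {
	pthread_mutex_t lock;	/* stands in for the scheduler locks */
	uint64_t cpus_mask;	/* current affinity, one bit per CPU */
	uint64_t user_mask;	/* affinity saved before a restriction, 0 if none */
};

/*
 * Single entry point: lock, do the work, unlock. Callers never see the
 * lock, so there is no _locked() variant that unlocks on their behalf.
 */
static int toy_set_cpus_allowed(struct toy_task *p, uint64_t new_mask,
				unsigned int flags)
{
	int ret = 0;

	pthread_mutex_lock(&p->lock);

	if (flags & TOY_SCA_RESTRICT) {
		/* Extra work: intersect with the current mask. */
		uint64_t restricted = p->cpus_mask & new_mask;

		if (!restricted) {
			ret = -1;	/* nothing left to run on */
			goto out;
		}
		/* Remember the pre-restriction mask the first time only. */
		if (!p->user_mask)
			p->user_mask = p->cpus_mask;
		new_mask = restricted;
	}

	if (!new_mask) {
		ret = -1;	/* an empty affinity mask is never valid */
		goto out;
	}
	p->cpus_mask = new_mask;
out:
	pthread_mutex_unlock(&p->lock);
	return ret;
}
```

The trade-off is the one discussed above: the lock handling stays in one
function, at the cost of a flag that changes what the function does.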

> Also FWIW we have this pattern of pre-allocating pcpu cpumasks
> (select_idle_mask, load_balance_mask), but given this is AIUI a
> very-not-hot path, this might be overkill (and reusing an existing one
> would be on the icky side of things).

I think that makes sense for static masks, but since this is dynamic I was
following the lead of sched_setaffinity().
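
As one reading of that contrast, the two allocation patterns can be sketched
in userspace C as below. The names (toy_scratch_mask, toy_and_static(),
toy_and_dynamic(), TOY_NR_CPUS) are hypothetical, and malloc() merely stands
in for the kernel's cpumask allocation; this is not what sched_setaffinity()
literally does.

```c
#include <stdint.h>
#include <stdlib.h>

#define TOY_NR_CPUS 4	/* hypothetical CPU count for the sketch */

/* Static pattern: one scratch mask per CPU, reserved up front. */
static uint64_t toy_scratch_mask[TOY_NR_CPUS];

static uint64_t toy_and_static(int cpu, uint64_t a, uint64_t b)
{
	uint64_t *tmp = &toy_scratch_mask[cpu];	/* no allocation, cannot fail */

	*tmp = a & b;
	return *tmp;
}

/* Dynamic pattern: allocate on demand, handle failure, free when done. */
static int toy_and_dynamic(uint64_t a, uint64_t b, uint64_t *result)
{
	uint64_t *tmp = malloc(sizeof(*tmp));

	if (!tmp)
		return -1;	/* caller must cope with allocation failure */

	*tmp = a & b;
	*result = *tmp;
	free(tmp);
	return 0;
}
```

The static variant avoids the failure path but permanently reserves memory
per CPU, which is hard to justify on a path that is rarely taken.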

Will