Re: [PATCH v8 08/19] sched: Reject CPU affinity changes based on task_cpu_possible_mask()

From: Will Deacon
Date: Mon Jun 07 2021 - 18:44:04 EST


On Fri, Jun 04, 2021 at 06:11:52PM +0100, Valentin Schneider wrote:
> On 02/06/21 17:47, Will Deacon wrote:
> > Reject explicit requests to change the affinity mask of a task via
> > set_cpus_allowed_ptr() if the requested mask is not a subset of the
> > mask returned by task_cpu_possible_mask(). This ensures that the
> > 'cpus_mask' for a given task cannot contain CPUs which are incapable of
> > executing it, except in cases where the affinity is forced.
> >
> > Reviewed-by: Quentin Perret <qperret@xxxxxxxxxx>
> > Signed-off-by: Will Deacon <will@xxxxxxxxxx>
>
> One comment/observation below, but regardless:
>
> Reviewed-by: Valentin Schneider <Valentin.Schneider@xxxxxxx>

Thanks!

> > diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> > index 0c1b6f1a6c91..b23c7f0ab31a 100644
> > --- a/kernel/sched/core.c
> > +++ b/kernel/sched/core.c
> > @@ -2347,15 +2347,17 @@ static int __set_cpus_allowed_ptr(struct task_struct *p,
> > u32 flags)
> > {
> > const struct cpumask *cpu_valid_mask = cpu_active_mask;
> > + const struct cpumask *cpu_allowed_mask = task_cpu_possible_mask(p);
> > unsigned int dest_cpu;
> > struct rq_flags rf;
> > struct rq *rq;
> > int ret = 0;
> > + bool kthread = p->flags & PF_KTHREAD;
> >
> > rq = task_rq_lock(p, &rf);
> > update_rq_clock(rq);
> >
> > - if (p->flags & PF_KTHREAD || is_migration_disabled(p)) {
> > + if (kthread || is_migration_disabled(p)) {
> > /*
> > * Kernel threads are allowed on online && !active CPUs,
> > * however, during cpu-hot-unplug, even these might get pushed
> > @@ -2369,6 +2371,11 @@ static int __set_cpus_allowed_ptr(struct task_struct *p,
> > cpu_valid_mask = cpu_online_mask;
> > }
> >
> > + if (!kthread && !cpumask_subset(new_mask, cpu_allowed_mask)) {
> > + ret = -EINVAL;
> > + goto out;
> > + }
> > +
>
> IIUC this wouldn't be required if guarantee_online_cpus() couldn't build a
> mask that extends beyond task_cpu_possible_mask(p): if the new mask doesn't
> intersect with that possible mask, it means we're carrying an empty cpumask
> and the cpumask_any_and_distribute() below would return nr_cpu_ids, so we'd
> bail with -EINVAL.
>
> I don't really see a way around it though due to the expectations behind
> guarantee_online_cpus() :/

Mostly agreed. I started out hacking SCA itself and only then started
knocking the callers on the head. However, given how many callers this
thing has, I'm much more comfortable having the mask check in SCA so that
we return an error whenever the requested mask contains CPUs on which the
task is unable to run. And yes, guarantee_online_cpus() is one such caller.

Will