Re: [PATCH v3 11/14] sched: Reject CPU affinity changes based on arch_cpu_allowed_mask()

From: Will Deacon
Date: Thu Nov 19 2020 - 06:07:34 EST


On Thu, Nov 19, 2020 at 09:47:44AM +0000, Quentin Perret wrote:
> On Friday 13 Nov 2020 at 09:37:16 (+0000), Will Deacon wrote:
> > Reject explicit requests to change the affinity mask of a task via
> > set_cpus_allowed_ptr() if the requested mask is not a subset of the
> > mask returned by arch_cpu_allowed_mask(). This ensures that the
> > 'cpus_mask' for a given task cannot contain CPUs which are incapable of
> > executing it, except in cases where the affinity is forced.
> >
> > Signed-off-by: Will Deacon <will@xxxxxxxxxx>
> > ---
> > kernel/sched/core.c | 4 ++++
> > 1 file changed, 4 insertions(+)
> >
> > diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> > index 8df38ebfe769..13bdb2ae4d3f 100644
> > --- a/kernel/sched/core.c
> > +++ b/kernel/sched/core.c
> > @@ -1877,6 +1877,7 @@ static int __set_cpus_allowed_ptr_locked(struct task_struct *p,
> > struct rq_flags *rf)
> > {
> > const struct cpumask *cpu_valid_mask = cpu_active_mask;
> > + const struct cpumask *cpu_allowed_mask = arch_cpu_allowed_mask(p);
> > unsigned int dest_cpu;
> > int ret = 0;
> >
> > @@ -1887,6 +1888,9 @@ static int __set_cpus_allowed_ptr_locked(struct task_struct *p,
> > * Kernel threads are allowed on online && !active CPUs
> > */
> > cpu_valid_mask = cpu_online_mask;
> > + } else if (!cpumask_subset(new_mask, cpu_allowed_mask)) {
> > + ret = -EINVAL;
> > + goto out;
>
> So, IIUC, this should make the sched_setaffinity() syscall fail and
> return -EINVAL to userspace if it tries to put 64-bit CPUs in the
> affinity mask of a 32-bit task, which I think makes sense.
>
> But what about affinity change via cpusets? e.g., if a 32-bit task is
> migrated to a cpuset with 64-bit CPUs, then the migration will be
> 'successful' and the task will appear to be in the destination cgroup,
> but the actual affinity of the task will be something completely
> different?

Yeah, the cpuset code ignores the return value of set_cpus_allowed_ptr() in
update_tasks_cpumask() so the failure won't be propagated, but then again I
think that might be the right thing to do. Nothing prevents 32-bit and
64-bit tasks from co-existing in the same cpuset afaict, so forcing the
64-bit tasks onto the 32-bit-capable cores feels much worse than the
approach taken here imo. Nothing says we _have_ to schedule on all of the
cores in the mask.
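
To make that concrete, here is a toy userspace model of the behaviour
described above. It is a sketch only, not kernel code: the toy_* names, the
64-bit bitmask standing in for a cpumask and the 'capable' field standing in
for arch_cpu_allowed_mask(p) are all made up for illustration.

```c
#include <errno.h>
#include <stdint.h>

/* Toy stand-in for a cpumask: bit n set means CPU n is in the mask. */
typedef uint64_t toy_cpumask_t;

/* Hypothetical task; 'capable' plays the role of arch_cpu_allowed_mask(p). */
struct toy_task {
	toy_cpumask_t cpus_mask;
	toy_cpumask_t capable;
};

/* Returns 1 if every CPU in 'a' is also in 'b', like cpumask_subset(). */
static int toy_mask_subset(toy_cpumask_t a, toy_cpumask_t b)
{
	return (a & ~b) == 0;
}

/* Models the new check: reject masks naming CPUs the task cannot run on. */
static int toy_set_affinity(struct toy_task *p, toy_cpumask_t new_mask)
{
	if (!toy_mask_subset(new_mask, p->capable))
		return -EINVAL;

	p->cpus_mask = new_mask;
	return 0;
}

/* Models a caller that drops the return value, as the cpuset path does. */
static void toy_cpuset_update(struct toy_task *p, toy_cpumask_t cpuset_mask)
{
	(void)toy_set_affinity(p, cpuset_mask);
}
```

In this model a rejected cpuset update leaves the task's existing mask
untouched, while a direct caller sees -EINVAL and can report it.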

The interesting case is what happens if the cpuset for a 32-bit task is
changed to contain only the 64-bit-only cores. I think that's a userspace
bug, but the fallback rq selection should avert disaster.
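
Roughly the shape of fallback I have in mind, again as a toy userspace
sketch rather than the kernel's actual fallback path. The toy_* names and
the plain 64-bit masks are assumptions made purely for illustration.

```c
#include <stdint.h>

/* Toy stand-in for a cpumask: bit n set means CPU n is in the mask. */
typedef uint64_t toy_cpumask_t;

/* Returns the lowest CPU number set in 'mask', or -1 if it is empty. */
static int toy_first_cpu(toy_cpumask_t mask)
{
	for (int cpu = 0; cpu < 64; cpu++) {
		if (mask & (UINT64_C(1) << cpu))
			return cpu;
	}
	return -1;
}

/*
 * Hypothetical CPU selection: honour the requested mask where possible,
 * otherwise fall back to any online CPU that can actually run the task.
 */
static int toy_pick_cpu(toy_cpumask_t task_mask, toy_cpumask_t capable,
			toy_cpumask_t online)
{
	int cpu = toy_first_cpu(task_mask & capable & online);

	if (cpu >= 0)
		return cpu;

	/* The requested mask has no usable CPU: ignore it. */
	return toy_first_cpu(capable & online);
}
```

With CPUs 0-1 capable and a requested mask covering only CPUs 2-3, this
model ends up on a capable CPU instead of one that cannot run the task.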

Will