Re: [PATCH v3 07/14] sched: Introduce restrict_cpus_allowed_ptr() to limit task CPU affinity

From: Will Deacon
Date: Thu Nov 19 2020 - 08:13:33 EST


On Thu, Nov 19, 2020 at 12:47:34PM +0000, Valentin Schneider wrote:
>
> On 13/11/20 09:37, Will Deacon wrote:
> > Asymmetric systems may not offer the same level of userspace ISA support
> > across all CPUs, meaning that some applications cannot be executed by
> > some CPUs. As a concrete example, upcoming arm64 big.LITTLE designs do
> > not feature support for 32-bit applications on both clusters.
> >
> > Although userspace can carefully manage the affinity masks for such
> > tasks, one place where it is particularly problematic is execve()
> > because the CPU on which the execve() is occurring may be incompatible
> > with the new application image. In such a situation, it is desirable to
> > restrict the affinity mask of the task and ensure that the new image is
> > entered on a compatible CPU.
>
> > From userspace's point of view, this looks the same as if the
> > incompatible CPUs have been hotplugged off in its affinity mask.
>
> {pedantic reading warning}
>
> Hotplugged CPUs *can* be set in a task's affinity mask, though they
> interact weirdly with cpusets [1]. Having it be the same as hotplug would
> mean keeping incompatible CPUs allowed in the affinity mask, but preventing
> them from being picked via e.g. is_cpu_allowed().

Sure, but I was talking about what userspace sees, and I don't think it ever
sees CPUs that have been hotplugged off, right? That is, sched_getaffinity()
masks its result with the active_mask.

> I was actually hoping this could be a feasible approach, i.e. have an
> extra CPU active mask filter for any task:
>
> cpu_active_mask & arch_cpu_allowed_mask(p)
>
> rather than fiddle with task affinity. Sadly this would also require fixing
> up pretty much any place that uses cpu_active_mask, and probably places
> that use p->cpus_ptr as well. RT / DL balancing comes to mind, because that
> uses either a task's affinity or a CPU's root domain (which reflects the
> cpu_active_mask) to figure out where to push a task.

Yeah, I tried this at one point and you end up playing whack-a-mole trying
to figure out why a task got killed. p->cpus_ptr is used all over the place,
and I think if we took this approach then we couldn't realistically remove
the sanity check on the ret-to-user path.

> So while I'm wary of hacking up affinity, I fear it might be the lesser
> evil :(

It's the best thing I've been able to come up with.

Will