Re: [PATCH v8 08/19] sched: Reject CPU affinity changes based on task_cpu_possible_mask()

From: Valentin Schneider
Date: Fri Jun 04 2021 - 13:12:00 EST


On 02/06/21 17:47, Will Deacon wrote:
> Reject explicit requests to change the affinity mask of a task via
> set_cpus_allowed_ptr() if the requested mask is not a subset of the
> mask returned by task_cpu_possible_mask(). This ensures that the
> 'cpus_mask' for a given task cannot contain CPUs which are incapable of
> executing it, except in cases where the affinity is forced.
>
> Reviewed-by: Quentin Perret <qperret@xxxxxxxxxx>
> Signed-off-by: Will Deacon <will@xxxxxxxxxx>

One comment/observation below, but regardless:

Reviewed-by: Valentin Schneider <Valentin.Schneider@xxxxxxx>

> ---
> kernel/sched/core.c | 9 ++++++++-
> 1 file changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 0c1b6f1a6c91..b23c7f0ab31a 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -2347,15 +2347,17 @@ static int __set_cpus_allowed_ptr(struct task_struct *p,
> u32 flags)
> {
> const struct cpumask *cpu_valid_mask = cpu_active_mask;
> + const struct cpumask *cpu_allowed_mask = task_cpu_possible_mask(p);
> unsigned int dest_cpu;
> struct rq_flags rf;
> struct rq *rq;
> int ret = 0;
> + bool kthread = p->flags & PF_KTHREAD;
>
> rq = task_rq_lock(p, &rf);
> update_rq_clock(rq);
>
> - if (p->flags & PF_KTHREAD || is_migration_disabled(p)) {
> + if (kthread || is_migration_disabled(p)) {
> /*
> * Kernel threads are allowed on online && !active CPUs,
> * however, during cpu-hot-unplug, even these might get pushed
> @@ -2369,6 +2371,11 @@ static int __set_cpus_allowed_ptr(struct task_struct *p,
> cpu_valid_mask = cpu_online_mask;
> }
>
> + if (!kthread && !cpumask_subset(new_mask, cpu_allowed_mask)) {
> + ret = -EINVAL;
> + goto out;
> + }
> +

IIUC this wouldn't be required if guarantee_online_cpus() couldn't build a
mask that extends beyond task_cpu_possible_mask(p): if the new mask doesn't
intersect with that possible mask, it means we're carrying an empty cpumask
and the cpumask_any_and_distribute() below would return nr_cpu_ids, so we'd
bail with -EINVAL.

I don't really see a way around it though due to the expectations behind
guarantee_online_cpus() :/

> /*
> * Must re-check here, to close a race against __kthread_bind(),
> * sched_setaffinity() is not guaranteed to observe the flag.
> --
> 2.32.0.rc0.204.g9fa02ecfa5-goog