Re: [PATCH v11 08/16] sched: Allow task CPU affinity to be restricted on asymmetric systems

From: Will Deacon
Date: Wed Aug 18 2021 - 06:42:38 EST


Hi Peter,

On Tue, Aug 17, 2021 at 05:10:53PM +0200, Peter Zijlstra wrote:
> On Fri, Jul 30, 2021 at 12:24:35PM +0100, Will Deacon wrote:
> > @@ -2783,20 +2778,173 @@ static int __set_cpus_allowed_ptr(struct task_struct *p,
> >
> > __do_set_cpus_allowed(p, new_mask, flags);
> >
> > - return affine_move_task(rq, p, &rf, dest_cpu, flags);
> > + if (flags & SCA_USER)
> > + release_user_cpus_ptr(p);
> > +
> > + return affine_move_task(rq, p, rf, dest_cpu, flags);
> >
> > out:
> > - task_rq_unlock(rq, p, &rf);
> > + task_rq_unlock(rq, p, rf);
> >
> > return ret;
> > }
>
> > +void relax_compatible_cpus_allowed_ptr(struct task_struct *p)
> > +{
> > + unsigned long flags;
> > + struct cpumask *mask = p->user_cpus_ptr;
> > +
> > + /*
> > + * Try to restore the old affinity mask. If this fails, then
> > + * we free the mask explicitly to avoid it being inherited across
> > + * a subsequent fork().
> > + */
> > + if (!mask || !__sched_setaffinity(p, mask))
> > + return;
> > +
> > + raw_spin_lock_irqsave(&p->pi_lock, flags);
> > + release_user_cpus_ptr(p);
> > + raw_spin_unlock_irqrestore(&p->pi_lock, flags);
> > +}
>
> Both these are a problem on RT.

Ah, sorry. I didn't realise you couldn't _free_ with a raw lock held in RT.
Is there somewhere I can read up on that?

> The easiest recourse is simply never freeing the CPU mask (except on
> exit). The alternative is something like the below I suppose..
>
> I'm leaning towards the former option, wdyt?

Deferring the freeing until exit feels a little fiddly, as we still
want to clear ->user_cpus_ptr on affinity changes when SCA_USER is set,
so we'd have to keep track of the mask somewhere and reuse it instead
of allocating a new one if we need it later on. Doable, but feels a bit
nasty, particularly across fork().

As for your other suggestion:

> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -2733,6 +2733,7 @@ static int __set_cpus_allowed_ptr_locked
> const struct cpumask *cpu_allowed_mask = task_cpu_possible_mask(p);
> const struct cpumask *cpu_valid_mask = cpu_active_mask;
> bool kthread = p->flags & PF_KTHREAD;
> + struct cpumask *user_mask = NULL;
> unsigned int dest_cpu;
> int ret = 0;
>
> @@ -2792,9 +2793,13 @@ static int __set_cpus_allowed_ptr_locked
> __do_set_cpus_allowed(p, new_mask, flags);
>
> if (flags & SCA_USER)
> - release_user_cpus_ptr(p);
> + swap(user_mask, p->user_cpus_ptr);
> +
> + ret = affine_move_task(rq, p, rf, dest_cpu, flags);
> +
> + kfree(user_mask);
>
> - return affine_move_task(rq, p, rf, dest_cpu, flags);
> + return ret;
>
> out:
> task_rq_unlock(rq, p, rf);
> @@ -2954,8 +2959,10 @@ void relax_compatible_cpus_allowed_ptr(s
> return;
>
> raw_spin_lock_irqsave(&p->pi_lock, flags);
> - release_user_cpus_ptr(p);
> + p->user_cpus_ptr = NULL;
> raw_spin_unlock_irqrestore(&p->pi_lock, flags);
> +
> + kfree(mask);

I think the idea looks good, but perhaps we could wrap things up a bit:

	/* Comment about why this is useful with RT */
	static cpumask_t *clear_user_cpus_ptr(struct task_struct *p)
	{
		struct cpumask *user_mask = NULL;

		swap(user_mask, p->user_cpus_ptr);
		return user_mask;
	}

	void release_user_cpus_ptr(struct task_struct *p)
	{
		kfree(clear_user_cpus_ptr(p));
	}

Then just use clear_user_cpus_ptr() in sched/core.c where we know what
we're doing (well, at least one of us does!).
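
To illustrate the ownership hand-off, here is a rough userspace model of the
pattern: detach the pointer while the lock is held, and free it only after the
lock has been dropped. This is a sketch, not kernel code. struct task, the
atomic_flag spinlock, and the names clear_user_mask(), release_user_mask() and
relax_mask() are all hypothetical stand-ins for task_struct, p->pi_lock and
the helpers discussed above.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for task_struct; not kernel code. */
struct task {
	atomic_flag lock;		/* plays the role of p->pi_lock */
	unsigned long *user_mask;	/* plays the role of p->user_cpus_ptr */
};

/* Minimal spinlock built on atomic_flag (assumption: stands in for a raw spinlock). */
static void task_lock(struct task *t)
{
	while (atomic_flag_test_and_set_explicit(&t->lock, memory_order_acquire))
		;	/* spin until the flag was previously clear */
}

static void task_unlock(struct task *t)
{
	atomic_flag_clear_explicit(&t->lock, memory_order_release);
}

/* Detach the mask and hand ownership to the caller; never frees. */
static unsigned long *clear_user_mask(struct task *t)
{
	unsigned long *mask;

	assert(t != NULL);
	mask = t->user_mask;
	t->user_mask = NULL;
	return mask;
}

/* Convenience wrapper for callers that hold no lock. */
static void release_user_mask(struct task *t)
{
	free(clear_user_mask(t));
}

/* Detach under the lock, free only once the lock has been dropped. */
static void relax_mask(struct task *t)
{
	unsigned long *mask;

	task_lock(t);
	mask = clear_user_mask(t);
	task_unlock(t);

	free(mask);	/* free(NULL) is a no-op, so no NULL check is needed */
}
```

The point of splitting the helper in two is that the locked path only ever
performs a pointer swap, while the free happens in a context where it is
allowed.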

Will