Re: [RFC PATCH] kernel: sched: Provide a pointer to the valid CPU mask
From: Ingo Molnar
Date: Wed Apr 05 2017 - 03:39:52 EST
* Sebastian Andrzej Siewior <bigeasy@xxxxxxxxxxxxx> wrote:
> In commit 4b53a3412d66 ("sched/core: Remove the tsk_nr_cpus_allowed()
> wrapper") the tsk_nr_cpus_allowed() wrapper was removed. There was not
> much difference in !RT but in RT we used this to implement
> migrate_disable(). Within a migrate_disable() section the CPU mask is
> restricted to single CPU while the "normal" CPU mask remains untouched.
>
> As an alternative implementation Ingo suggested to use
> struct task_struct {
> const cpumask_t *cpus_ptr;
> cpumask_t cpus_mask;
> };
> with
> t->cpus_ptr = &t->cpus_mask;
>
> In -RT we then can switch the cpus_ptr to
> t->cpus_ptr = cpumask_of(task_cpu(t));
>
> in a migration disabled region. The rules are simple:
> - Code that 'uses' ->cpus_allowed would use the pointer.
> - Code that 'modifies' ->cpus_allowed would use the direct mask.
>
> While converting the existing users I tried to stick with the rules
> above however... well mostly CPUFREQ tries to temporarily switch the CPU
> mask to do something on a certain CPU and then switches the mask back to
> its original value. So in theory `cpus_ptr' could or should be used.
> However if this is invoked in a migration disabled region (which is not
> the case because it would require something like preempt_disable() and
> set_cpus_allowed_ptr() might sleep so it can't be) then the "restore"
> part would restore the wrong mask. So it only looks strange and I go for
> the pointer...

So maybe we could add the following facility:

	ptr = sched_migrate_to_cpu_save(cpu);
	...
	sched_migrate_to_cpu_restore(ptr);

... and use it in the cpufreq code. Then -rt could simply define migrate_disable()
to be:

	ptr = sched_migrate_to_cpu_save(raw_smp_processor_id());

and define migrate_enable() as:

	sched_migrate_to_cpu_restore(ptr);

... or such.

In the cpu == current_cpu case it would be super fast - otherwise it would migrate
over to the target CPU first. Also note that this facility is strictly a special
case for single-CPU masks and migrations - i.e. the constant pointer cpumask
optimization would always apply.
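
To make the intended semantics concrete, here is a rough userspace model of the save/restore pair. It is only a sketch under stated assumptions: 'struct task_model', 'cpumask_model_t', 'single_cpu_mask[]' and the 'model_*' functions are hypothetical names invented for illustration, the task is passed explicitly instead of using 'current', and a "migration" is merely counted rather than performed.

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS 4

/* Hypothetical stand-in for cpumask_t: one bit per CPU. */
typedef unsigned long cpumask_model_t;

/* Hypothetical userspace model of a task, not the kernel's task_struct. */
struct task_model {
	const cpumask_model_t *cpus_ptr;	/* mask that readers use */
	cpumask_model_t cpus_mask;		/* mask that affinity changes modify */
	int cpu;				/* CPU the task currently runs on */
	int migrations;				/* number of simulated migrations */
};

/* Constant single-CPU masks, playing the role of cpumask_of(cpu). */
static const cpumask_model_t single_cpu_mask[NR_CPUS] = {
	1UL << 0, 1UL << 1, 1UL << 2, 1UL << 3,
};

static void task_model_init(struct task_model *t, int cpu)
{
	t->cpus_mask = (1UL << NR_CPUS) - 1;	/* allowed on every CPU */
	t->cpus_ptr = &t->cpus_mask;
	t->cpu = cpu;
	t->migrations = 0;
}

/* Pin the task to @cpu and return the previously active mask pointer. */
static const cpumask_model_t *
model_migrate_to_cpu_save(struct task_model *t, int cpu)
{
	const cpumask_model_t *old = t->cpus_ptr;

	assert(cpu >= 0 && cpu < NR_CPUS);

	if (t->cpu != cpu) {
		/* Slow path: the task has to move to the target CPU first. */
		t->cpu = cpu;
		t->migrations++;
	}
	/* Fast path and slow path both end by swapping in a constant mask. */
	t->cpus_ptr = &single_cpu_mask[cpu];

	return old;
}

/* Undo the pinning by restoring the pointer handed out by the save call. */
static void model_migrate_to_cpu_restore(struct task_model *t,
					 const cpumask_model_t *old)
{
	t->cpus_ptr = old;
}
```

In this model the cpu == current_cpu case touches nothing but the pointer, and 'cpus_mask' is never written by either call, so a restore cannot bring back a wrong mask.
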

Note that due to the use of the 'ptr' local variable the interface nests
naturally, so this would be a legitimate use:

	ptr = sched_migrate_to_cpu_save(cpu);
	...
		migrate_disable();
		...
		migrate_enable();
	...
	sched_migrate_to_cpu_restore(ptr);

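
The nesting property can be checked with a small self-contained userspace sketch. Again this only illustrates the idea: 'mask_t', 'cpu_bit[]', 'pin_save()', 'pin_restore()' and 'nested_pin_check()' are hypothetical names, and the inner pair stands in for migrate_disable()/migrate_enable() by pinning to whatever CPU the task is already on.

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS 4

typedef unsigned long mask_t;	/* hypothetical stand-in for cpumask_t */

/* Constant single-CPU masks, playing the role of cpumask_of(cpu). */
static const mask_t cpu_bit[NR_CPUS] = {
	1UL << 0, 1UL << 1, 1UL << 2, 1UL << 3,
};

/* Hypothetical userspace model of a task. */
struct task_model {
	const mask_t *cpus_ptr;	/* mask that readers use */
	mask_t cpus_mask;	/* the task's "normal" affinity mask */
	int cpu;		/* CPU the task currently runs on */
};

/* Pin to @cpu; the caller keeps the old pointer in a local variable. */
static const mask_t *pin_save(struct task_model *t, int cpu)
{
	const mask_t *old = t->cpus_ptr;

	assert(cpu >= 0 && cpu < NR_CPUS);
	t->cpu = cpu;
	t->cpus_ptr = &cpu_bit[cpu];
	return old;
}

static void pin_restore(struct task_model *t, const mask_t *old)
{
	t->cpus_ptr = old;
}

/*
 * Returns 0 if every restore brought back exactly the mask that was
 * active before the matching save, nonzero otherwise.
 */
static int nested_pin_check(void)
{
	struct task_model t = { .cpus_mask = 0xfUL, .cpu = 0 };
	const mask_t *outer, *inner;

	t.cpus_ptr = &t.cpus_mask;

	outer = pin_save(&t, 1);	/* outer: move to CPU 1 */
	inner = pin_save(&t, t.cpu);	/* inner: migrate_disable() analogue */
	if (*t.cpus_ptr != cpu_bit[1])
		return 1;

	pin_restore(&t, inner);		/* migrate_enable() analogue */
	if (t.cpus_ptr != &cpu_bit[1])
		return 2;		/* outer pinning must still hold */

	pin_restore(&t, outer);
	if (t.cpus_ptr != &t.cpus_mask)
		return 3;		/* original mask must be back */

	return 0;
}
```

Because each level keeps its own saved pointer, the inner restore returns to the outer single-CPU mask and not to the task's full mask, which is the behaviour the nested example relies on.
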
I.e. my proposal would be to essentially upstream the -rt migrate_disable()
facility in a slightly more generic form that would fit the cpufreq usecase.
I bet a number of the current drivers mucking with cpumasks would also fit this
new API.
Does this make sense?
Thanks,
Ingo