Re: [RFC PATCH] kernel: sched: Provide a pointer to the valid CPU mask

From: Sebastian Andrzej Siewior
Date: Wed Apr 05 2017 - 04:38:13 EST


On 2017-04-05 09:39:43 [+0200], Ingo Molnar wrote:
>
> So maybe we could add the following facility:
>
> ptr = sched_migrate_to_cpu_save(cpu);
>
> ...
>
> sched_migrate_to_cpu_restore(ptr);
>
> ... and use it in the cpufreq code. Then -rt could simply define migrate_disable()
> to be:
>
> ptr = sched_migrate_to_cpu_save(raw_smp_processor_id());
>
> and define migrate_enable() as:
>
> sched_migrate_to_cpu_restore(ptr);
>
> ... or such.
>
> In the cpu == current_cpu case it would be super fast - otherwise it would migrate
> over to the target CPU first. Also note that this facility is strictly a special
> case for single-CPU masks and migrations - i.e. the constant pointer cpumask
> optimization would always apply.
>
> Note that due to the use of the 'ptr' local variable the interface nests
> naturally, so this would be a legitimate use:
>
> ptr = sched_migrate_to_cpu_save(cpu);
>
> ...
> migrate_disable();
> ...
> migrate_enable();
> ...
>
> sched_migrate_to_cpu_restore(ptr);
>
> I.e. my proposal would be to essentially upstream the -rt migrate_disable()
> facility in a slightly more generic form that would fit the cpufreq usecase.
>
> I bet a number of the current driver's mucking with cpumask would also fit this
> new API.
>
> Does this make sense?

It kind of does. If you want to allow migration to a different CPU then it
might make things a little complicated because we need to support
nested migrate_disable() and we can't (must not) change the mask while
nesting.
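
To illustrate the nesting constraint, here is a rough userspace model, not
kernel code. Every name in it (struct task_model, model_migrate_disable(),
model_set_allowed(), ...) is made up for this sketch, and refusing a mask
change while pinned is only the simplest policy to model; a real
implementation might defer the change instead.

```c
#include <assert.h>

/* Hypothetical userspace model; none of these names are kernel API. */
struct task_model {
	unsigned long allowed_mask;	/* the task's regular affinity mask */
	int cur_cpu;			/* CPU the task is currently running on */
	int pinned_cpu;			/* only meaningful while migrate_depth > 0 */
	int migrate_depth;		/* nesting depth of migrate_disable() */
};

/* Pin to the current CPU; nested calls only bump the depth counter. */
static void model_migrate_disable(struct task_model *t)
{
	if (t->migrate_depth++ == 0)
		t->pinned_cpu = t->cur_cpu;
}

/* Undo one level of nesting; the pin is dropped when depth reaches 0. */
static void model_migrate_enable(struct task_model *t)
{
	assert(t->migrate_depth > 0);
	t->migrate_depth--;
}

/*
 * Change the affinity mask. Assumption of this model: the change is
 * refused (-1) while the task is pinned, so the mask cannot move under
 * a nested section. Returns 0 on success.
 */
static int model_set_allowed(struct task_model *t, unsigned long new_mask)
{
	if (t->migrate_depth > 0)
		return -1;
	t->allowed_mask = new_mask;
	return 0;
}

/* The mask the scheduler would honour: single CPU while pinned. */
static unsigned long model_effective_mask(const struct task_model *t)
{
	if (t->migrate_depth > 0)
		return 1UL << t->pinned_cpu;
	return t->allowed_mask;
}
```

With two nested model_migrate_disable() calls the effective mask stays at
the single pinned CPU until the outermost model_migrate_enable(), and only
then does model_set_allowed() succeed again.
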
Other than that, I am not sure the cpufreq usecase is valid.
schedule_work_on() looks better but maybe not as fast as the proposed
sched_migrate_to_cpu_save(). Also, some users look wrong:
[PATCH] CPUFREQ: Loongson2: drop set_cpus_allowed_ptr()
https://www.linux-mips.org/archives/linux-mips/2017-04/msg00042.html

and I received an off-list mail pointing to the other cpufreq users. So here
I am waiting for some feedback from the cpufreq maintainer.

But I get your point. Other than super-fast switching to a specific CPU
for $reason we could replace a few preempt_disable() invocations which
are only there due to per-CPU variables. So let me hack this into -RT
and come back…

> Thanks,
>
> Ingo

Sebastian