Re: [RFC PATCH] kernel: sched: Provide a pointer to the valid CPU mask
From: Thomas Gleixner
Date: Thu Apr 06 2017 - 06:37:04 EST
On Thu, 6 Apr 2017, Ingo Molnar wrote:
> Sorry if this is a back and forth - I was somehow convinced that we do need to
> frob the cpus_allowed mask to get this functionality - but in hindsight I think
> the counter should be enough.
>
> I.e. just have a counter and these two APIs:
>
> static inline void migrate_disable(void)
> {
>	current->migration_disabled++;
> }
>
> ...
>
> static inline void migrate_enable(void)
> {
>	current->migration_disabled--;
> }
>
> ... and make sure the scheduler migration code plus the CPU hotplug code considers
> the counter.
We tried that some time ago, but there are a lot of places in the scheduler
which just rely on the cpus_allowed_mask, so we would have to chase all of
them and make sure that new users do not ignore that counter. That's why we
chose the cpus_allowed_mask approach. And I still think that's the proper
thing to do.
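To illustrate the problem (a made up sketch, not from any tree): a typical
scheduler side decision looks like the function below. The helpers and the
cpus_allowed field are the existing ones, the function itself is invented.
A bare per task counter is simply invisible to every site like this, so
each of them would need an extra check.

static int pick_allowed_online_cpu(struct task_struct *p)
{
	int cpu;

	/*
	 * The mask alone decides. A migration_disabled counter is not
	 * consulted anywhere in here.
	 */
	for_each_cpu(cpu, &p->cpus_allowed) {
		if (cpu_online(cpu))
			return cpu;
	}
	return -1;
}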
Also, this still requires magic in the context switch, where we need to
transport that information to the CPU, so that part cannot go away.
CPU hotplug has other issues. If migration is disabled, then blocking out
the hotplug code is probably easy to do, but you need to undo the blocking
when the CPU is no longer pinned, i.e. when the last task enables migration
again.
To make that all less horrible, we need to fix the stupid nested usage of
get_online_cpus() first. That'd allow us to convert the hotplug locking
into a percpu_rwsem, which is desired anyway as it makes get_online_cpus()
cheap.
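Just to illustrate what that conversion buys (a sketch only, reusing the
hotplug_rwsem name from below and the stock percpu_rwsem primitives):

static DEFINE_STATIC_PERCPU_RWSEM(hotplug_rwsem);

void get_online_cpus(void)
{
	/* Read side: a cheap per CPU operation in the fast path */
	percpu_down_read(&hotplug_rwsem);
}

void put_online_cpus(void)
{
	percpu_up_read(&hotplug_rwsem);
}

CPU unplug then takes the write side and is blocked for as long as any
reader holds the thing.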
So then the migration stuff becomes:
void migrate_disable(void)
{
	/* Atomic and irq-disabled regions cannot be migrated anyway */
	if (in_atomic() || irqs_disabled())
		return;

	if (!current->migration_disabled) {
		/* First level of pinning: block CPU hotplug */
		percpu_down_read_preempt_disable(hotplug_rwsem);
		current->migration_disabled++;
		preempt_enable();
	} else {
		current->migration_disabled++;
	}
}

void migrate_enable(void)
{
	if (in_atomic() || irqs_disabled())
		return;

	if (current->migration_disabled == 1) {
		/* Last level of pinning goes away: unblock CPU hotplug */
		preempt_disable();
		current->migration_disabled--;
		percpu_up_read_preempt_enable(hotplug_rwsem);
	} else {
		current->migration_disabled--;
	}
}
That wants to have some debug extras: see the current RT version of it.
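For illustration, a typical usage pattern would look like this (made up
example, program_device_on_cpu() is a fictional helper which may sleep):

static void poke_local_device(void)
{
	int cpu;

	migrate_disable();
	/* The CPU number stays valid until migrate_enable() */
	cpu = raw_smp_processor_id();
	program_device_on_cpu(cpu);
	migrate_enable();
}

The task stays preemptible all the way through, it just cannot be moved to
another CPU.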
Coming back to the cpus_allowed mask.
Most of the cpus_allowed usage outside of the scheduler is broken and we
really should clean that up now and then restrict access to cpus_allowed.
The easy ones (and the majority) are those:
	cpumask_copy(saved_mask, &current->cpus_allowed);
	set_cpus_allowed_ptr(current, cpumask_of(XXX));

	do_something();

	set_cpus_allowed_ptr(current, saved_mask);
That code is not at all protected against user space changing the affinity
or against CPUs going down. We'd better clean those places up now, before
doing anything about cpus_allowed in the core code.
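Spelled out with the races annotated (annotations are mine, the calls are
the ones quoted above):

	cpumask_copy(saved_mask, &current->cpus_allowed);
	set_cpus_allowed_ptr(current, cpumask_of(XXX));

	/*
	 * Nothing stops user space from calling sched_setaffinity() right
	 * here; its update is silently wiped out by the restore below.
	 * Nothing stops CPU XXX from being unplugged either, so
	 * do_something() is not guaranteed to run where the caller thinks.
	 */
	do_something();

	/* May restore a mask which is stale by now */
	set_cpus_allowed_ptr(current, saved_mask);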
Thanks,
tglx