Re: [PATCH 1/3] sched: select_task_rq() should check cpu_active() like select_fallback_rq()

From: Oleg Nesterov
Date: Mon Oct 12 2015 - 13:37:36 EST


On 10/12, Peter Zijlstra wrote:
>
> On Sat, Oct 10, 2015 at 08:53:09PM +0200, Oleg Nesterov wrote:
> > I do not understand the cpu_active() check in select_fallback_rq().
> > x86 doesn't need it, and the recent commit dd9d3843755d "sched: Fix
> > cpu_active_mask/cpu_online_mask race" documents the fact that on any
> > architecture we can ignore !active starting from CPU_ONLINE stage.
> >
> > But any possible reason why do we need this check in "fallback" must
> > equally apply to select_task_rq().
>
> So the reason, from vague memory, is that we want to allow per-cpu
> threads to start/stop before/after active.

I simply can't understand... To me it looks as if we can simply remove
the cpu_active() check in select_fallback_rq().

If we race with cpu_down(), cpu_active() is cleared by sched_cpu_inactive()
which is CPU_PRI_SCHED_INACTIVE = INT_MIN + 1 priority, so it seems that
only cpuset_cpu_inactive() can be called after that and this check is
obviously racy anyway.

As for cpu_up(), I do not see any arch which does set_cpu_active(true),
they all do set_cpu_online(true) which also marks it active.

So why we can't simply remove select_fallback_rq()->cpu_active() ?

> active 'should' really only govern load-balancer bits or so.

OK, I don't understand the usage of cpu_active_mask in kernel/sched/,
and of course I could easily miss something else. But I doubt very
much this check can save us from something bad simply because it is
racy.

Yes, we also have synchronize_sched() after CPU_DOWN_PREPARE, but
the only thing we do before stop_machine() is smpboot_park_threads().
So this can help (say) set_cpus_allowed_ptr() which uses active_mask,
but I don't see how this can connect to ttwu paths.

And again. If there is some reason we can not do this, say, because
ipi to non-active CPU can trigger a warning, or something else, then
we can hit the same problem because select_task_rq() does not check
cpu_active(). The kernel threads like stoppers/workers are probably
fine in any case, but userspace can trigger the race with cpu_up/down.


In short, I am all confused ;)

Oleg.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/