Re: [PATCH 1/3] sched: select_task_rq() should check cpu_active() like select_fallback_rq()

From: Thomas Gleixner
Date: Sun Oct 11 2015 - 14:59:12 EST


On Sun, 11 Oct 2015, Oleg Nesterov wrote:
> On 10/10, Oleg Nesterov wrote:
> >
> > I do not understand the cpu_active() check in select_fallback_rq().
> > x86 doesn't need it, and the recent commit dd9d3843755d "sched: Fix
> > cpu_active_mask/cpu_online_mask race" documents the fact that on any
> > architecture we can ignore !active starting from CPU_ONLINE stage.
> >
> > But any possible reason why do we need this check in "fallback" must
> > equally apply to select_task_rq().
>
> And I still think this is true, select_task_rq() and select_fallback_rq()
> should use the same check in any case...
>
> > +static inline bool cpu_allowed(int cpu)
> > +{
> > + return cpu_online(cpu) && cpu_active(cpu);
> > +}
> ...
> > @@ -1390,7 +1391,7 @@ int select_task_rq(struct task_struct *p, int cpu, int sd_flags, int wake_flags)
> > * not worry about this generic constraint ]
> > */
> > if (unlikely(!cpumask_test_cpu(cpu, tsk_cpus_allowed(p)) ||
> > - !cpu_online(cpu)))
> > + !cpu_allowed(cpu)))
> > cpu = select_fallback_rq(task_cpu(p), p);
>
> But as Fengguang reports (thanks a lot!) this change is wrong. It leads
> to another BUG_ON(td->cpu != smp_processor_id()) before ht->park(td->cpu)
> in smpboot_thread_fn().
>
> I should have realized this. smpboot_park_threads() is called after
> CPU_DOWN_PREPARE. And this can break other PF_NO_SETAFFINITY threads.
>
> Perhaps I am totally confused, but to me this looks like another
> indication that select_fallback_rq() should not check cpu_active(),
> or at least this needs some changes...

Well, the real issue is that the bringup and teardown of cpus is
completely assymetric. And as long as we do not fix that, we'll run
into that kind of trouble over and over. We just add more duct tape to
the cpu hotplug code.

The solution to the problem at hand is to have two separate decisions
for threads to be scheduled on a upcoming or downgoing cpu.

We need to allow per cpu kernel threads to be scheduled after the
initial bringup is done and on teardown we must allow them to be
scheduled to the point where the cpu is actually not longer capable to
do so.

For everything else the decision must be at the end of the
bringup. From that point on random threads can be scheduled on the
cpu. On teardown we need to prevent that as the first thing.

We are currently resurrecting the hotplug revamp patch series, which I
sent out a couple of years ago. The goal of this is to make it
completely symmetric and some more to address exactly this kind of
trouble.

Thanks,

tglx
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/