Re: [PATCH 3/3] sched: start stopper early
From: Oleg Nesterov
Date: Fri Oct 09 2015 - 12:53:04 EST
On 10/09, Oleg Nesterov wrote:
>
> From: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Peter, I tried to compromise you.
> case CPU_ONLINE:
> + stop_machine_unpark(cpu);
> /*
> * At this point a starting CPU has marked itself as online via
> * set_cpu_online(). But it might not yet have marked itself
> @@ -5337,7 +5340,7 @@ static int sched_cpu_active(struct notifier_block *nfb,
> * Thus, fall-through and help the starting CPU along.
> */
> case CPU_DOWN_FAILED:
> - set_cpu_active((long)hcpu, true);
> + set_cpu_active(cpu, true);
On second thought, we can't do this (and your initial change has
the same problem).
We cannot wake up the stopper before set_cpu_active(). This can lead to the
same problem fixed by dd9d3843755da95f6 "sched: Fix cpu_active_mask/
cpu_online_mask race". The stopper thread can hit
BUG_ON(td->cpu != smp_processor_id()) in smpboot_thread_fn().
Easy to fix: CPU_ONLINE should do set_cpu_active() itself and not
fall through to CPU_DOWN_FAILED:
case CPU_ONLINE:
set_cpu_active(cpu, true);
stop_machine_unpark(cpu);
break;
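The ordering can be modelled in userspace. This is only a minimal sketch, assuming stub versions of set_cpu_active() and stop_machine_unpark() that just record state. The *_model names, the enum values and NR_CPUS_MODEL are hypothetical and not the kernel's:

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS_MODEL 4		/* hypothetical, for the model only */

/* Illustrative action values; the real ones live in the kernel headers. */
enum { CPU_ONLINE, CPU_DOWN_FAILED, CPU_OTHER };

static bool cpu_active_model[NR_CPUS_MODEL];
static bool stopper_unparked_model[NR_CPUS_MODEL];

/* Stub: only records that the CPU was marked active (or inactive). */
static void set_cpu_active(int cpu, bool active)
{
	cpu_active_model[cpu] = active;
}

/*
 * Stub: records the unpark. The assert is the model's stand-in for the
 * requirement above: the stopper must not be woken on a CPU that is
 * not yet active.
 */
static void stop_machine_unpark(int cpu)
{
	assert(cpu_active_model[cpu]);
	stopper_unparked_model[cpu] = true;
}

/* Returns 1 if the action was handled, 0 otherwise (model convention). */
static int sched_cpu_active_model(unsigned long action, int cpu)
{
	switch (action) {
	case CPU_ONLINE:
		/* Mark active first, then wake the stopper; no fall-through. */
		set_cpu_active(cpu, true);
		stop_machine_unpark(cpu);
		break;
	case CPU_DOWN_FAILED:
		set_cpu_active(cpu, true);
		break;
	default:
		return 0;
	}
	return 1;
}
```

With the two calls swapped in the CPU_ONLINE case, the assert in the stub fires, which is the window described above.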
But. This is another proof that stop_two_cpus() must not rely on
cpu_active().
Right?
Oleg.