Re: [PATCH 10/43] stop_machine: reimplement without using workqueue

From: Oleg Nesterov
Date: Mon Mar 01 2010 - 10:40:19 EST


On 03/02, Tejun Heo wrote:
> > and more importantly, if it was possible
> > stop_machine_cpu_callback(CPU_POST_DEAD) (which is called after
> > cpu_hotplug_done()) could race with stop_machine().
> > stop_machine_cpu_callback(CPU_POST_DEAD) relies on fact that this
> > thread has already called schedule() and it can't be woken until
> > kthread_stop() sets ->should_stop.
> Hmmm... I'm probably missing something but I don't see how
> stop_machine_cpu_callback(CPU_POST_DEAD) depends on stop_cpu() thread
> already parked in schedule(). Can you elaborate a bit?

Suppose that, when stop_machine_cpu_callback(CPU_POST_DEAD) is called,
the stop_cpu() thread T is still running and is about to check state
before schedule().

CPU_POST_DEAD is called after cpu_hotplug_done(), so another CPU can do
stop_machine() and set STOPMACHINE_PREPARE.

If T sees state == STOPMACHINE_PREPARE it will join the game, but it
wasn't counted in the thread_ack counter, it is not cpu-bound, etc.
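
To make the counting problem concrete, here is a rough userspace model,
not the real kernel code. The sm_* names, the struct, and the plain int
counter are made up for illustration (the kernel's thread_ack is atomic),
and the interleaving is replayed sequentially instead of with real
threads. It only shows that one ack from an uncounted thread lets the
state advance before every counted thread has acked.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified copy of the stop_machine states. */
enum sm_state { SM_NONE, SM_PREPARE, SM_DISABLE_IRQ, SM_RUN, SM_EXIT };

struct sm_model {
	enum sm_state state;
	int thread_ack;		/* acks still expected for the current state */
	int num_threads;	/* threads the initiator actually counted */
};

/* Entering a new state resets the ack counter to the counted threads. */
static void sm_set_state(struct sm_model *m, enum sm_state s)
{
	m->thread_ack = m->num_threads;
	m->state = s;
}

/* The last expected ack moves the machine to the next state. */
static void sm_ack_state(struct sm_model *m)
{
	assert(m->thread_ack > 0);
	if (--m->thread_ack == 0)
		sm_set_state(m, (enum sm_state)(m->state + 1));
}

/*
 * Replay one interleaving: the initiator sets PREPARE, the stale thread T
 * optionally acks, then all counted threads except one ack.
 * Returns true if the state left PREPARE although one counted thread
 * has not acked yet.
 */
static bool sm_advances_early(int counted, bool stale_thread_joins)
{
	struct sm_model m = { SM_NONE, 0, counted };
	int i;

	assert(counted >= 1);
	sm_set_state(&m, SM_PREPARE);

	if (stale_thread_joins)
		sm_ack_state(&m);	/* T was never counted in thread_ack */

	for (i = 0; i < counted - 1; i++)
		sm_ack_state(&m);	/* counted threads, minus the last one */

	return m.state != SM_PREPARE;
}
```

With two counted threads, sm_advances_early(2, true) reports that the
state moved on while one counted thread is still missing, and
sm_advances_early(2, false) does not.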

> >> int __stop_machine(int (*fn)(void *), void *data, const struct cpumask *cpus)
> >> {
> >> ...
> >>         /* Schedule the stop_cpu work on all cpus: hold this CPU so one
> >>          * doesn't hit this CPU until we're ready. */
> >>         get_cpu();
> >> +       for_each_online_cpu(i)
> >> +               wake_up_process(*per_cpu_ptr(stop_machine_threads, i));
> >
> > I think the comment is wrong, and we need preempt_disable() instead
> > of get_cpu(). We shouldn't worry about this CPU, but we need to ensure
> > the woken real-time thread can't preempt us until we wake up them all.
> get_cpu() and preempt_disable() are exactly the same thing, aren't
> they?


> Do you think get_cpu() is wrong there for some reason?

No. I think that the comment is confusing, and preempt_disable()
"looks" more correct.

In any case, this is very minor, please ignore. In fact, I mentioned
this only because this email was much longer initially: at first I
thought I had noticed a bug, but I was wrong ;)


