Re: [RFC PATCH 0/6] stop_machine: kill stop_cpus_mutex and stop_cpus_lock

From: Oleg Nesterov
Date: Mon Jun 29 2015 - 00:04:40 EST


On 06/26, Oleg Nesterov wrote:
>
> 2 cmpxchg()'s vs 2 spin_lock()'s. Plus wake_up(), but we can check
> waitqueue_active().
>
> Do you think thi will be noticeably slower?
>
> Of course, if it races with another stop_two_cpus/stop_cpus it will
> sleep, but in this case we need to wait anyway.
>
>
> And I don't think that percpu-rwsem instead of stop_cpu_mutex makes
> sense. at least I don't understand how can it help. OK, stop_two_cpus()
> can use percpu_down_read() to avoid the deadlock with stop_cpus(), but
> you still need double-lock... So I don't think this will make it faster,
> this will just penalize stop_cpus(). Or I misunderstood.
>
> So I am still not convinced... But probably I am too biased ;)

Yes... I'll probably try to make v2, this version is overcomplicated
and buggy.


> Btw. I can't understand the cpu_active() checks in stop_two_cpus().
> Do we really need them?

Ah, please ignore.

Yes, we can't rely on stopper->enabled check in cpu_stop_queue_work(),
cpu_stop_signal_done() does not update multi_stop_data->num_threads /
->thread_ack. So we need to ensure that cpu_online() == T for both CPUS
or multi_cpu_stop() can hang.

But we can't use cpu_online() instead, take_cpu_down() can be already
queued.

So this relies on the fact that CPU_DOWN_PREPARE (which removes CPU
from cpu_active_mask) is called before stop_machine(take_cpu_down) and
we do not care that cpu_active() is not stable; if we see cpu_active()
cpu_online() can't change unders us because take_cpu_down() was not
queued.

If we change stop_two_cpus() to use stop_work_alloc_one() it can use
cpu_online(),

int stop_two_cpus(unsigned int cpu1, unsigned int cpu2, cpu_stop_fn_t fn, void *arg)
{
struct cpu_stop_work *work1, *work2;
struct cpu_stop_done done;
struct multi_stop_data msdata = {
.fn = fn,
.data = arg,
.num_threads = 2,
.active_cpus = cpumask_of(cpu1),
};

set_state(&msdata, MULTI_STOP_PREPARE);
cpu_stop_init_done(&done, 2);

if (cpu1 > cpu2)
swap(cpu1, cpu2);

work1 = stop_work_alloc_one(cpu1, true);
work2 = stop_work_alloc_one(cpu2, true);
/* stop_machine() is blocked, cpu can't go away */
if (cpu_online(cpu1) && cpu_online(cpu2)) {
work1->fn = work2->fn = multi_cpu_stop;
work1->arg = work2->arg = &msdata;
work1->done = work2->done = &done;

preempt_disable();
cpu_stop_queue_work(cpu1, work1);
cpu_stop_queue_work(cpu2, work2);
preempt_enable();
wait_for_completion(&done.completion);
}

stop_work_free_one(cpu1);
stop_work_free_one(cpu2);
stop_work_wake_up();

return done.executed ? done.ret : -ENOENT;
}

Oleg.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/