Re: [RFC PATCH 0/6] stop_machine: kill stop_cpus_mutex and stop_cpus_lock

From: Oleg Nesterov
Date: Fri Jun 26 2015 - 16:47:45 EST


On 06/26, Peter Zijlstra wrote:
>
> On Fri, Jun 26, 2015 at 04:14:55AM +0200, Oleg Nesterov wrote:
> > Not sure.
> >
> > And note that this series kills stop_cpus_mutex, so that multiple
> > stop_cpus()'s / stop_machine()'s can run in parallel if cpumask's
> > do not overlap.
> >
> > Note also the changelog in 6/6, we can simplify + optimize this code
> > a bit more.
> >
> > What do you think?
>
> The problem I have with this is that it makes the better operation
> (stop_two_cpus) slower while improving the worse operation (stop_cpus).
>
> I would much prefer to keep stop_two_cpus() as proposed with taking two
> cpu_stopper::lock instances and replacing the stop_cpu_mutex with a
> percpu-rwsem.

OK, lets avoid cpumask in stop_two_cpus,

int stop_two_cpus(unsigned int cpu1, unsigned int cpu2, cpu_stop_fn_t fn, void *arg)
{
struct multi_stop_data msdata;
struct cpu_stop_done done;
struct cpu_stop_work *work1, *work2;

msdata = (struct multi_stop_data){
.fn = fn,
.data = arg,
.num_threads = 2,
.active_cpus = cpumask_of(cpu1),
};

cpu_stop_init_done(&done, 2);
set_state(&msdata, MULTI_STOP_PREPARE);

if (cpu1 > cpu2)
swap(cpu1, cpu2);
work1 = stop_work_alloc_one(cpu1, true);
work2 = stop_work_alloc_one(cpu1, true);

*work1 = *work2 = (struct cpu_stop_work) {
.fn = multi_cpu_stop,
.arg = &msdata,
.done = &done
};

preempt_disable();
cpu_stop_queue_work(cpu1, work1);
cpu_stop_queue_work(cpu2, work2);
preempt_enable();

wait_for_completion(&done.completion);

stop_work_free_one(cpu1);
stop_work_free_one(cpu2);
wake_up(&stop_work_wq);

return done.executed ? done.ret : -ENOENT;
}

2 cmpxchg()'s vs 2 spin_lock()'s. Plus wake_up(), but we can check
waitqueue_active().

Do you think thi will be noticeably slower?

Of course, if it races with another stop_two_cpus/stop_cpus it will
sleep, but in this case we need to wait anyway.


And I don't think that percpu-rwsem instead of stop_cpu_mutex makes
sense. at least I don't understand how can it help. OK, stop_two_cpus()
can use percpu_down_read() to avoid the deadlock with stop_cpus(), but
you still need double-lock... So I don't think this will make it faster,
this will just penalize stop_cpus(). Or I misunderstood.

So I am still not convinced... But probably I am too biased ;)


Btw. I can't understand the cpu_active() checks in stop_two_cpus().
Do we really need them?

Oleg.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/