Re: [PATCH] hotplug: Optimize {get,put}_online_cpus()

From: Peter Zijlstra
Date: Tue Sep 24 2013 - 12:51:47 EST


On Tue, Sep 24, 2013 at 06:03:59PM +0200, Oleg Nesterov wrote:
> On 09/24, Peter Zijlstra wrote:
> >
> > +static inline void get_online_cpus(void)
> > +{
> > + might_sleep();
> > +
> > + if (current->cpuhp_ref++) {
> > + barrier();
> > + return;
>
> I don't undestand this barrier()... we are going to return if we already
> hold the lock, do we really need it?
>
> The same for put_online_cpus().

to make {get,put}_online_cpus() always behave like per-cpu lock
sections.

I don't think its ever 'correct' for loads/stores to escape the section,
even if not strictly harmful.

> > +void __get_online_cpus(void)
> > {
> > - if (cpu_hotplug.active_writer == current)
> > + if (cpuhp_writer_task == current)
> > return;
>
> Probably it would be better to simply inc/dec ->cpuhp_ref in
> cpu_hotplug_begin/end and remove this check here and in
> __put_online_cpus().

Oh indeed!

> > + if (atomic_dec_and_test(&cpuhp_waitcount) && cpuhp_writer_task)
> > + cpuhp_writer_wake();
>
> cpuhp_writer_wake() here and in __put_online_cpus() looks racy...

Yeah it is. Paul already said.

> But, Peter, the main question is, why this is better than
> percpu_rw_semaphore performance-wise? (Assuming we add
> task_struct->cpuhp_ref).
>
> If the writer is pending, percpu_down_read() does
>
> down_read(&brw->rw_sem);
> atomic_inc(&brw->slow_read_ctr);
> __up_read(&brw->rw_sem);
>
> is it really much worse than wait_event + atomic_dec_and_test?
>
> And! please note that with your implementation the new readers will
> be likely blocked while the writer sleeps in synchronize_sched().
> This doesn't happen with percpu_rw_semaphore.

Good points both, no I don't think there's a significant performance gap
there.

I'm still hoping we can come up with something better though :/ I don't
particularly like either.


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/