Re: [RFC PATCH v2 01/10] CPU hotplug: Provide APIs for "light" atomic readers to prevent CPU offline

From: Tejun Heo
Date: Wed Dec 05 2012 - 13:27:31 EST


Hello, Oleg.

On Wed, Dec 05, 2012 at 07:15:24PM +0100, Oleg Nesterov wrote:
> On 12/05, Tejun Heo wrote:
> > Replacing get_online_cpus() w/ percpu_rwsem is great but this thread
> > is about replacing preempt_disable with something finer grained and
> > less heavy on the writer side
>
> If only I understood why preempt_disable() is bad ;)
>
> OK, I guess "less heavy on the writer side" is the hint, and in the
> previous email you mentioned that "stop_machine() itself is extremely
> heavy".
>
> Looks like, you are going to remove stop_machine() from cpu_down ???

Yeah, that's what Srivatsa is trying to do. The problem seems to be
that cpu up/down is very frequent on certain mobile platforms for
power management and, as currently implemented, cpu hotplug is too
heavy and latency-inducing.

> > The problem seems that we don't have percpu_rwlock yet. It shouldn't
> > be too difficult to implement, right?
>
> Oh, I am not sure... unless you simply copy-and-paste the lglock code
> and replace spinlock_t with rwlock_t.

Ah... right, so that's where brlock ended up. So, lglock is the new
thing and brlock is a wrapper around it.

> We probably want something more efficient, but I bet we can't avoid
> the barriers on the read side.
>
> And somehow we should avoid the livelocks. Say, we can't simply add
> the per_cpu_reader_counter, _read_lock should spin if the writer is
> active. But at the same time _read_lock should be recursive.
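
To make the constraints in the quoted paragraph concrete, here is a
rough user-space sketch of a per-slot reader counter where a new
reader backs off while a writer is active but a nested reader does
not. It is not kernel code and not a proposal: the toy_* names,
NR_SLOTS, and the use of seq_cst C11 atomics as the barriers are all
assumptions for illustration, and each slot is assumed to be used by
exactly one thread at a time (as a CPU would be with preemption off).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define NR_SLOTS 4	/* hypothetical stand-in for the possible CPUs */

struct toy_percpu_rwlock {
	atomic_int  readers[NR_SLOTS];	/* per-slot read nesting depth */
	atomic_bool writer_active;
};

static void toy_rw_init(struct toy_percpu_rwlock *l)
{
	for (int i = 0; i < NR_SLOTS; i++)
		atomic_init(&l->readers[i], 0);
	atomic_init(&l->writer_active, false);
}

static bool toy_read_trylock(struct toy_percpu_rwlock *l, int slot)
{
	assert(slot >= 0 && slot < NR_SLOTS);

	/* Nested reader: already counted, so it must not wait for a writer. */
	if (atomic_load(&l->readers[slot]) > 0) {
		atomic_fetch_add(&l->readers[slot], 1);
		return true;
	}

	/* Announce ourselves first, then check for a writer (seq_cst). */
	atomic_fetch_add(&l->readers[slot], 1);
	if (!atomic_load(&l->writer_active))
		return true;

	/* A writer is active: back off so it can make progress. */
	atomic_fetch_sub(&l->readers[slot], 1);
	return false;
}

static void toy_read_lock(struct toy_percpu_rwlock *l, int slot)
{
	while (!toy_read_trylock(l, slot))
		while (atomic_load(&l->writer_active))
			;	/* spin until the writer is done */
}

static void toy_read_unlock(struct toy_percpu_rwlock *l, int slot)
{
	atomic_fetch_sub(&l->readers[slot], 1);
}

static void toy_write_lock(struct toy_percpu_rwlock *l)
{
	bool expected = false;

	/* Only one writer at a time. */
	while (!atomic_compare_exchange_weak(&l->writer_active, &expected, true))
		expected = false;

	/* Wait for every slot's outermost reader to drain. */
	for (int i = 0; i < NR_SLOTS; i++)
		while (atomic_load(&l->readers[i]) > 0)
			;
}

static void toy_write_unlock(struct toy_percpu_rwlock *l)
{
	atomic_store(&l->writer_active, false);
}
```

The read side still pays for a full barrier between the increment and
the writer check, which is the cost referred to above.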

I think we should just go with lglock. It does involve local atomic
ops, but atomic ops themselves aren't that expensive, and it's not
like we can avoid memory barriers anyway. Also, lglock is the
non-sleeping counterpart of percpu_rwsem. If it's not good enough for
some reason, we should improve it rather than introduce something
else.
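
For reference, the basic shape of lglock can be sketched in user
space roughly as below. This is a toy, not the kernel implementation:
the toy_lg_* names, NR_SLOTS, and the atomic_flag spinlocks are
assumptions made for illustration. The local side takes only the
caller's own slot; the global side takes every slot in a fixed order.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define NR_SLOTS 4	/* hypothetical stand-in for the possible CPUs */

struct toy_lglock {
	atomic_flag slot[NR_SLOTS];	/* one tiny spinlock per slot */
};

static void toy_lg_init(struct toy_lglock *lg)
{
	for (int i = 0; i < NR_SLOTS; i++)
		atomic_flag_clear(&lg->slot[i]);
}

/* Non-blocking attempt on one slot; returns true if the lock was taken. */
static bool toy_lg_local_trylock(struct toy_lglock *lg, int slot)
{
	assert(slot >= 0 && slot < NR_SLOTS);
	return !atomic_flag_test_and_set_explicit(&lg->slot[slot],
						  memory_order_acquire);
}

/* "Reader" side: contend only on the caller's own slot. */
static void toy_lg_local_lock(struct toy_lglock *lg, int slot)
{
	while (!toy_lg_local_trylock(lg, slot))
		;	/* spin */
}

static void toy_lg_local_unlock(struct toy_lglock *lg, int slot)
{
	atomic_flag_clear_explicit(&lg->slot[slot], memory_order_release);
}

/* "Writer" side: take every slot, always in the same order. */
static void toy_lg_global_lock(struct toy_lglock *lg)
{
	for (int i = 0; i < NR_SLOTS; i++)
		toy_lg_local_lock(lg, i);
}

static void toy_lg_global_unlock(struct toy_lglock *lg)
{
	for (int i = 0; i < NR_SLOTS; i++)
		toy_lg_local_unlock(lg, i);
}
```

Note that the local side here is a plain spinlock, so it is not
recursive: taking the same slot twice from the same context would
deadlock.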

Thanks.

--
tejun