Re: Severe performance regression w/ 4.4+ on Android due to cgroup locking changes

From: Tejun Heo
Date: Wed Jul 13 2016 - 17:01:43 EST


On Wed, Jul 13, 2016 at 10:51:02PM +0200, Peter Zijlstra wrote:
> On Wed, Jul 13, 2016 at 04:39:44PM -0400, Tejun Heo wrote:
> So, IIRC, the trade-off is a full memory barrier in read_lock and
> read_unlock() vs sync_sched() in write.
>
> Full memory barriers are expensive and while the combined cost might
> well exceed the cost of the sync_sched() it doesn't suffer the latency
> issues.

Given the way the read side is used for percpu_rwsem, a full memory
barrier on the reader side shouldn't matter at all. The paths are not
*that* hot.

> Not sure if we can frob the two in a single codebase, but I can have a
> poke if Oleg or Paul doesn't beat me to it.

At its simplest, it can be the rwsem equivalent of lglock.

--
tejun