Re: [PATCH] rwsem: reduce spinlock contention in wakeup code path

From: Linus Torvalds
Date: Sun Sep 29 2013 - 19:26:28 EST


On Sun, Sep 29, 2013 at 4:06 PM, Davidlohr Bueso <davidlohr@xxxxxx> wrote:
>>
>> Btw, I really hate that thing. I think we should turn it back into a
>> spinlock. None of what it protects needs a mutex or an rwsem.
>
> The same should apply to i_mmap_mutex, which has a similar
> responsibility to the anon-vma lock, just for file-backed pages. A few
> months ago I suggested converting that lock to an rwsem, with some
> pretty reasonable performance improvement numbers.
>
> http://lwn.net/Articles/556342/

Ok, that's pretty convincing too.
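
Just to have something concrete in front of us: what you're proposing
is basically the obvious mutex -> rwsem substitution, along the lines
of the sketch below. The "i_mmap_rwsem" field name is made up for
illustration, I'm not quoting your actual patch.

#include <linux/fs.h>
#include <linux/rwsem.h>

/*
 * Illustrative only, not the real patch: "i_mmap_rwsem" is a made-up
 * field standing in for the converted i_mmap_mutex.
 */
static void file_rmap_walk(struct address_space *mapping)
{
        /* rmap walks and lookups only read the interval tree */
        down_read(&mapping->i_mmap_rwsem);
        /* ... walk mapping->i_mmap ... */
        up_read(&mapping->i_mmap_rwsem);
}

static void file_vma_insert(struct address_space *mapping)
{
        /* mapping/unmapping a vma still needs exclusive access */
        down_write(&mapping->i_mmap_rwsem);
        /* ... modify mapping->i_mmap ... */
        up_write(&mapping->i_mmap_rwsem);
}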

Side note: are you sure that the i_mmap_mutex needs to be a sleeping
lock at all? It's documented to nest outside the anon_vma->rwsem, so as
long as that is a sleeping lock, the i_mmap_mutex has to be one too
(you can't take a sleeping lock while holding a spinlock). But looking
at the actual users, most of them seem to be *very* similar to the
anon_vma->rwsem users. It is a very close cousin to the anon_vma->rwsem,
after all (just for file-backed pages rather than anonymous ones). No?
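
To make the nesting concrete: a path that needs both rmap locks ends up
doing roughly the below (hand-waved sketch, not actual code from the
tree), and since the inner lock sleeps, the outer one has to be allowed
to sleep as well.

#include <linux/fs.h>
#include <linux/mm.h>
#include <linux/mutex.h>
#include <linux/rmap.h>

/* Hand-waved sketch of the documented nesting: the file rmap lock is
 * taken outside the anon_vma lock, roughly what a move_ptes()-style
 * caller does when it needs both. */
static void take_both_rmap_locks(struct vm_area_struct *vma)
{
        struct address_space *mapping = NULL;
        struct anon_vma *anon_vma = NULL;

        if (vma->vm_file) {
                mapping = vma->vm_file->f_mapping;
                mutex_lock(&mapping->i_mmap_mutex);     /* outer */
        }
        if (vma->anon_vma) {
                anon_vma = vma->anon_vma;
                anon_vma_lock_write(anon_vma);          /* inner, sleeps */
        }

        /* ... do the work that needs both rmap locks held ... */

        if (anon_vma)
                anon_vma_unlock_write(anon_vma);
        if (mapping)
                mutex_unlock(&mapping->i_mmap_mutex);
}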

I dunno. Maybe the ranges are too big and it really has latency
issues, but the few users I looked at seemed like fairly trivial
interval-tree operations.
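
Something like this, typically - just a lookup over the vmas mapping a
given file range (sketch; the range arguments are placeholders):

#include <linux/fs.h>
#include <linux/mm.h>
#include <linux/mutex.h>

/* Sketch of the typical short critical section: walk the vmas that map
 * a given file range.  Nothing in the loop body should need to sleep. */
static void for_each_mapping_vma(struct address_space *mapping,
                                 pgoff_t first, pgoff_t last)
{
        struct vm_area_struct *vma;

        mutex_lock(&mapping->i_mmap_mutex);
        vma_interval_tree_foreach(vma, &mapping->i_mmap, first, last) {
                /* small per-vma work (e.g. an rmap walk callback) */
        }
        mutex_unlock(&mapping->i_mmap_mutex);
}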

And your numbers for Ingo's patch:

> After testing Ingo's anon-vma rwlock_t conversion (v2) on an 8-socket,
> 80-core system with aim7, I am quite surprised by the numbers -
> considering the lack of queuing in rwlocks. A lot of the tests hardly
> showed any difference, but those that really contend this lock (with
> high numbers of users) benefited quite nicely:
>
> Alltests: +28% throughput after 1000 users and runtime was reduced from
> 7.2 to 6.6 secs.
>
> Custom: +61% throughput after 100 users and runtime was reduced from 7
> to 4.9 secs.
>
> High_systime: +40% throughput after 1000 users and runtime was reduced
> from 19 to 15.5 secs.
>
> Shared: +30.5% throughput after 100 users and runtime was reduced from
> 6.5 to 5.1 secs.
>
> Short: Lots of variance in the numbers, but avg of +29% throughput - no
> particular performance degradation either.

Are just overwhelming, in my opinion. The conversion *from* a spinlock
never had this kind of support behind it.

Btw, did anybody run Ingo's patch with lockdep and the spinlock sleep
debugging code to verify that we haven't introduced any problems wrt
sleeping since the lock was converted into a rw-semaphore?
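
I'm thinking of the usual CONFIG_PROVE_LOCKING plus
CONFIG_DEBUG_ATOMIC_SLEEP runs here: any sleeping call left inside the
now-spinning critical section should scream at us, along the lines of
this made-up example:

#include <linux/mutex.h>
#include <linux/spinlock.h>

static DEFINE_RWLOCK(converted_lock);   /* stand-in for the converted lock */
static DEFINE_MUTEX(some_mutex);

/* Made-up example of the class of bug the debug options catch: with
 * CONFIG_DEBUG_ATOMIC_SLEEP enabled, might_sleep() in mutex_lock()
 * warns, because write_lock() disables preemption. */
static void broken_after_conversion(void)
{
        write_lock(&converted_lock);
        mutex_lock(&some_mutex);        /* BUG: sleeping while atomic */
        mutex_unlock(&some_mutex);
        write_unlock(&converted_lock);
}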

Because quite frankly, considering these kinds of numbers, I really
don't see how we could possibly make excuses for keeping that
rw-semaphore unless there is some absolutely _horrible_ latency issue?

Linus