Re: Performance regression from switching lock to rw-sem for anon-vma tree

From: Ingo Molnar
Date: Tue Jul 23 2013 - 05:45:28 EST



* Tim Chen <tim.c.chen@xxxxxxxxxxxxxxx> wrote:

> Ingo,
>
> I tried MCS locking to order the writers but it didn't make much
> difference on my particular workload. After thinking about this some
> more, a likely explanation of the performance difference between mutex
> and rwsem is:
>
> 1) Jobs acquiring the mutex put themselves on the wait list only after
> optimistic spinning fails. That happens only 2% of the time on my test
> workload, so they access the wait list rarely.
>
> 2) Jobs acquiring the rw-sem for write *always* put themselves on the
> wait list first, before trying lock stealing and optimistic spinning.
> This creates a bottleneck at the wait list, and also more cache bouncing.

Indeed ...

> One possible optimization is to delay putting the writer on the wait
> list till after optimistic spinning, but we may need to keep track of
> the number of writers waiting. We could add a WAIT_BIAS to count for
> each write waiter and remove the WAIT_BIAS each time a writer job
> completes. This is tricky as I'm changing the semantics of the count
> field and likely will require a number of changes to rwsem code. Your
> thoughts on a better way to do this?

Why not just try the delayed addition approach first? The spinning is time
limited AFAICS, so we don't _have to_ recognize those as writers per se,
only if the spinning fails and it wants to go on the waitlist. Am I
missing something?

It will change patterns, and it might even change the fairness balance, but
it is otherwise a legitimate change, especially if it helps performance.
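
To make the delayed addition idea concrete, here is a rough user-space
sketch in C. It is not the rwsem code and not a proposed patch: every name
in it (toy_wlock, toy_write_lock, TOY_SPIN_LIMIT, toy_demo and so on) is
made up for illustration, readers are not modeled at all, and the spin is
bounded by a plain iteration count rather than by watching the lock owner.
The only point it shows is the ordering: a writer spins first and touches
the wait list only if the spin fails.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_SPIN_LIMIT  1000  /* hypothetical bound on optimistic spinning */
#define TOY_MAX_THREADS 64    /* hypothetical cap for the demo helper */

/* Toy writer-only lock; all names here are illustrative assumptions. */
struct toy_wlock {
    atomic_bool     locked;     /* zero-initialized: unlocked */
    pthread_mutex_t wait_lock;  /* protects nr_waiters and wait_cond */
    pthread_cond_t  wait_cond;
    int             nr_waiters; /* writers that gave up spinning */
};

static struct toy_wlock toy_lock = {
    .wait_lock = PTHREAD_MUTEX_INITIALIZER,
    .wait_cond = PTHREAD_COND_INITIALIZER,
};

static bool toy_trylock(struct toy_wlock *l)
{
    bool expected = false;
    return atomic_compare_exchange_strong(&l->locked, &expected, true);
}

static void toy_write_lock(struct toy_wlock *l)
{
    /* Fast path: bounded spin, no wait-list access at all. */
    for (int i = 0; i < TOY_SPIN_LIMIT; i++)
        if (toy_trylock(l))
            return;

    /* Slow path: only now does the writer register as a waiter. */
    pthread_mutex_lock(&l->wait_lock);
    l->nr_waiters++;
    while (!toy_trylock(l))
        pthread_cond_wait(&l->wait_cond, &l->wait_lock);
    l->nr_waiters--;
    pthread_mutex_unlock(&l->wait_lock);
}

static void toy_write_unlock(struct toy_wlock *l)
{
    assert(atomic_load(&l->locked));
    atomic_store(&l->locked, false);

    /*
     * Taking wait_lock on every unlock keeps the sketch simple and
     * avoids lost wakeups; a real implementation would avoid this.
     */
    pthread_mutex_lock(&l->wait_lock);
    if (l->nr_waiters > 0)
        pthread_cond_signal(&l->wait_cond);
    pthread_mutex_unlock(&l->wait_lock);
}

static long toy_counter;  /* protected by toy_lock */

static void *toy_worker(void *arg)
{
    long iters = *(long *)arg;

    for (long i = 0; i < iters; i++) {
        toy_write_lock(&toy_lock);
        toy_counter++;
        toy_write_unlock(&toy_lock);
    }
    return NULL;
}

/* Runs nthreads writers; returns the final counter, or -1 on bad input. */
long toy_demo(int nthreads, long iters)
{
    pthread_t tids[TOY_MAX_THREADS];

    if (nthreads < 1 || nthreads > TOY_MAX_THREADS)
        return -1;
    toy_counter = 0;
    for (int i = 0; i < nthreads; i++)
        pthread_create(&tids[i], NULL, toy_worker, &iters);
    for (int i = 0; i < nthreads; i++)
        pthread_join(tids[i], NULL);
    return toy_counter;
}
```

Calling toy_demo(4, 10000) from a main() should return 40000 if mutual
exclusion holds. Note that a spinner can take the lock ahead of a queued
waiter here, which is the kind of fairness shift mentioned above.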

Thanks,

Ingo