Re: [RFC][PATCH 0/3] locking/mutex: Rewrite basic mutex

From: Peter Zijlstra
Date: Thu Aug 25 2016 - 12:36:18 EST


On Thu, Aug 25, 2016 at 12:33:04PM -0400, Waiman Long wrote:
> On 08/25/2016 11:43 AM, Peter Zijlstra wrote:
> >On Tue, Aug 23, 2016 at 06:13:43PM -0700, Jason Low wrote:
> >>I tested this patch on an 8 socket system with the high_systime AIM7
> >>workload with diskfs. The patch provided big performance improvements in
> >>terms of throughput in the highly contended cases.
> >>
> >>---------------------------------------------------
> >>| users       | avg throughput | avg throughput   |
> >>|             | without patch  | with patch       |
> >>---------------------------------------------------
> >>| 10 - 90     | 13,943 JPM     | 14,432 JPM       |
> >>---------------------------------------------------
> >>| 100 - 900   | 75,475 JPM     | 102,922 JPM      |
> >>---------------------------------------------------
> >>| 1000 - 1900 | 77,299 JPM     | 115,271 JPM     |
> >>---------------------------------------------------
> >>
> >>Unfortunately, at 2000 users, the modified kernel locked up.
> >>
> >># INFO: task reaim:<#> blocked for more than 120 seconds.
> >>
> >>So something appears to be buggy.
> >So with the previously given changes to reaim, I get the below results
> >on my 4 socket Haswell with the new version of 1/3 (also below).
> >
> >I still need to update 3/3..
> >
> >Note that I think my reaim change wrecked the jobs/min calculation
> >somehow, as it keeps increasing. I do think however that the numbers are
> >comparable between runs, since they're wrecked the same way.
>
> The performance data for the two kernels were roughly the same. This was
> what I had been expecting, as there was no change in the algorithm for how
> the slowpath was handled. So I was surprised by Jason's result yesterday
> showing such a big difference.

It's because the mutex wasn't quite exclusive enough :-) If you let in
multiple owners, like with that race you found, you get big gains in
throughput ...