Re: [RFC PATCH RT] rwsem: The return of multi-reader PI rwsems

From: Ingo Molnar
Date: Mon Apr 14 2014 - 05:56:13 EST



* Steven Rostedt <rostedt@xxxxxxxxxxx> wrote:

> A while back I wrote a patch that would allow for reader/writer
> locks like rwlock and rwsems to have multiple readers in PREEMPT_RT. It
> was slick and fast but unfortunately it was way too complex and ridden
> with nasty little critters which earned me my large collection of
> frozen sharks in the fridge (which are quite tasty).
>
> The main problem with my previous solution was that I tried to be too
> clever. I worked hard on making the rw mutex still have the "fast
> path". That is, the cmpxchg that could allow a non contended grabbing
> of the lock be one instruction and be off with it. But to get that
> working required lots of tricks and black magic that was certainly
> going to fail. Thus, with the raining of sharks on my parade, the
> priority inheritance mutex with multiple owners died a slow painful
> death.
>
> So we thought.
>
> But over the years, a new darkness was on the horizon. Complaints came
> in that highly threaded processes (did I hear Java?) were suffering
> huge performance hits on the PREEMPT_RT kernel. Whether or not the
> processes were real-time tasks, they still were horrible compared to
> running the same tasks on the mainline kernel. Note, this was being
> done on machines with many CPUs.
>
> The culprit mostly was a single rwsem, the notorious mmap_sem that
> can be taken several times for read. On RT, this is just a single
> mutex, so it serializes accesses that would not be serialized on
> mainline.
>
> I looked back at my poor dead rw multi pi reader patch and thought to
> myself, "How complex would this be if I removed the 'fast path' from
> the code?" I decided to build a new tower in Mordor.
>
> I feel that I am correct. By removing the fast path and requiring all
> accesses to the rwsem to go through the slow path (must take the
> wait_lock to do anything), the code really wasn't that bad. I also only
> focused on the rwsem and did not worry about the rwlocks as that hasn't
> been pointed out as a bottleneck yet. If it does happen to be, this
> code could easily work on rwlocks too.
>
> I'm much more confident in this code than I was with my previous
> version of the rwlock multi-reader patch. I added a bunch of comments
> to this code to explain how things interact. The writer unlock was
> still able to use the fast path as the writers are pretty much like a
> normal mutex. Too bad that the writer unlock is not a high point of
> contention.
>
> This patch is built on top of the two other patches that I posted
> earlier, which should not be as controversial.
>
> If you have any benchmark on large machines I would be very happy if
> you could test this patch against the unpatched version of -rt.
>
> Cheers,
>
> -- Steve
>
> Signed-off-by: Steven Rostedt <rostedt@xxxxxxxxxxx>
> ---
> Index: linux-rt.git/kernel/rtmutex.c

Side note: could you please in general include diffstats with such
patches, especially since you seem to be exporting it from a Git repo?

Newfangled patch summaries like:

 include/linux/rtmutex.h  |   29 ++
 include/linux/rwsem_rt.h |    8
 include/linux/sched.h    |   20 +
 kernel/fork.c            |   20 +
 kernel/futex.c           |    2
 kernel/rt.c              |   27 +
 kernel/rtmutex.c         |  645 +++++++++++++++++++++++++++++++++++++++++++++--
 kernel/rtmutex_common.h  |   19 +
 kernel/sysctl.c          |   13
 9 files changed, 753 insertions(+), 30 deletions(-)

really give a useful bird's eye view of forest Fangorn, before
straying into it!

Thanks,

Ingo