Re: [PATCH v4 0/3] locking/rwsem: Rwsem rearchitecture part 0

From: Will Deacon
Date: Mon Feb 18 2019 - 09:58:36 EST


On Fri, Feb 15, 2019 at 01:58:34PM -0500, Waiman Long wrote:
> On 02/15/2019 01:40 PM, Will Deacon wrote:
> > On Thu, Feb 14, 2019 at 11:37:15AM +0100, Peter Zijlstra wrote:
> >> On Wed, Feb 13, 2019 at 05:00:14PM -0500, Waiman Long wrote:
> >>> v4:
> >>> - Remove rwsem-spinlock.c and make all archs use rwsem-xadd.c.
> >>>
> >>> v3:
> >>> - Optimize __down_read_trylock() for the uncontended case as suggested
> >>> by Linus.
> >>>
> >>> v2:
> >>> - Add patch 2 to optimize __down_read_trylock() as suggested by PeterZ.
> >>> - Update performance test data in patch 1.
> >>>
> >>> The goal of this patchset is to remove the architecture specific files
> >>> for rwsem-xadd to make it easier to add enhancements in the later rwsem
> >>> patches. It also removes the legacy rwsem-spinlock.c file and makes all
> >>> the architectures use one single implementation of rwsem - rwsem-xadd.c.
> >>>
> >>> Waiman Long (3):
> >>> locking/rwsem: Remove arch specific rwsem files
> >>> locking/rwsem: Remove rwsem-spinlock.c & use rwsem-xadd.c for all
> >>> archs
> >>> locking/rwsem: Optimize down_read_trylock()
> >> Acked-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
> >>
> >> with the caveat that I'm happy to exchange patch 3 back to my earlier
> >> suggestion in case Will expresses concerns wrt the ARM64 performance of
> >> Linus' suggestion.
> > Right, the current proposal doesn't work well for us, unfortunately. Which
> > was your earlier suggestion?
> >
> > Will
>
> In my posting yesterday, I showed that most of the trylocks done were
> actually uncontended. Assuming that pattern holds for most workloads,
> it will not be that bad after all.

That's fair enough; if you're going to sit in a tight trylock() loop like the
benchmark does, then you're much better off just calling lock() if you care
at all about scalability.

Will
