Re: [PATCH v2] tools/memory-model: Add extra ordering for locks and remove it for ordinary release/acquire

From: Paul E. McKenney
Date: Fri Jul 13 2018 - 00:01:29 EST


On Thu, Jul 12, 2018 at 07:05:39PM -0700, Daniel Lustig wrote:
> On 7/12/2018 11:10 AM, Linus Torvalds wrote:
> > On Thu, Jul 12, 2018 at 11:05 AM Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> >>
> >> The locking pattern is fairly simple and shows where RCpc comes apart
> >> from expectation real nice.
> >
> > So who does RCpc right now for the unlock-lock sequence? Somebody
> > mentioned powerpc. Anybody else?
> >
> > How nasty would it be to make powerpc conform? I will always advocate
> > tighter locking and ordering rules over looser ones..
> >
> > Linus
>
> RISC-V probably would have been RCpc if we weren't having this discussion.
> Depending on how we map atomics/acquire/release/unlock/lock, we can end up
> producing RCpc, "RCtso" (feel free to find a better name here...), or RCsc
> behaviors, and we're trying to figure out which we actually need.
>
> I think the debate is this:
>
> Obviously programmers would prefer just to have RCsc and not have to figure out
> all the complexity of the other options. On x86 or architectures with native
> RCsc operations (like ARMv8), that's generally easy enough to get.
>
> For weakly-ordered architectures that use fences for ordering (including
> PowerPC and sometimes RISC-V, see below), though, it takes extra fences to go
> from RCpc to either "RCtso" or RCsc. People using these architectures are
> concerned about whether there's a negative performance impact from those extra
> fences.
>
> However, some scheduler code, some RCU code, and probably some other examples
> already implicitly or explicitly assume unlock()/lock() provides stronger
> ordering than RCpc.

Just to be clear, the RCU code uses smp_mb__after_unlock_lock() to get
the ordering that it needs out of spinlocks. Maybe that is what you
meant by "explicitly assume", but I figured I should clarify.
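
For anyone following along, here is a rough userspace sketch of that
pattern. It is not kernel code: the sketch_* names, the C11-atomics
spinlock, and the two shared variables are all hypothetical stand-ins
made up for illustration. The assumption is that a plain acquire/release
lock models an RCpc-style spinlock, and that a full seq_cst fence placed
right after the lock acquisition plays the role that
smp_mb__after_unlock_lock() plays in the kernel.

```c
#include <stdatomic.h>

/* Hypothetical stand-in for a spinlock; NOT the kernel's implementation. */
typedef struct {
	atomic_flag locked;
} sketch_spinlock_t;

/* Acquire-only lock: on its own this gives no more than RCpc-style ordering. */
static void sketch_spin_lock(sketch_spinlock_t *l)
{
	while (atomic_flag_test_and_set_explicit(&l->locked,
						 memory_order_acquire))
		; /* spin until the flag was previously clear */
}

/* Release-only unlock. */
static void sketch_spin_unlock(sketch_spinlock_t *l)
{
	atomic_flag_clear_explicit(&l->locked, memory_order_release);
}

/*
 * Hypothetical analog of smp_mb__after_unlock_lock(): a full fence
 * issued immediately after acquiring a lock, so that a prior unlock
 * followed by this lock orders like a full barrier.
 */
static void sketch_mb_after_unlock_lock(void)
{
	atomic_thread_fence(memory_order_seq_cst);
}

static sketch_spinlock_t sketch_lock_a = { ATOMIC_FLAG_INIT };
static sketch_spinlock_t sketch_lock_b = { ATOMIC_FLAG_INIT };
static int sketch_shared_a, sketch_shared_b;

/* One CPU's view: unlock of one lock followed by lock of another. */
int sketch_unlock_lock_sequence(void)
{
	sketch_spin_lock(&sketch_lock_a);
	sketch_shared_a = 1;
	sketch_spin_unlock(&sketch_lock_a);

	sketch_spin_lock(&sketch_lock_b);
	sketch_mb_after_unlock_lock();	/* explicit, not assumed from the lock */
	sketch_shared_b = 2;
	sketch_spin_unlock(&sketch_lock_b);

	return sketch_shared_a + sketch_shared_b;
}
```

The point of the sketch is only that the stronger ordering is requested
explicitly at the call site instead of being assumed from the
unlock()/lock() pair itself.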

Thanx, Paul