Re: [PATCH RFC LKMM 1/7] tools/memory-model: Add extra ordering for locks and remove it for ordinary release/acquire

From: Alan Stern
Date: Thu Aug 30 2018 - 17:31:35 EST

On Thu, 30 Aug 2018, Andrea Parri wrote:

> > All the architectures supported by the Linux kernel (including RISC-V)
> > do provide this ordering for locks, albeit for varying reasons.
> > Therefore this patch changes the model in accordance with the
> > developers' wishes.
> >
> > Signed-off-by: Alan Stern <stern@xxxxxxxxxxxxxxxxxxx>
> > Signed-off-by: Paul E. McKenney <paulmck@xxxxxxxxxxxxxxxxxx>
> > Reviewed-by: Will Deacon <will.deacon@xxxxxxx>
> > Acked-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
> Round 2 ;-), I guess... Let me start from the uncontroversial points:
> 1) being able to use the LKMM to reason about generic locking code
> is useful and desirable (paraphrasing Peter in [1]);
> 2) strengthening the ordering requirements of such code isn't going
> to boost performance (that's "real maths").
> This patch is taking (1) away from us and it is formalizing (2), with
> almost _no_ reason (no reason at all, if we stick to the commit msg.).

That's not quite fair. Generic code isn't always universally
applicable; some of it is opt-in -- meant only for the architectures
that can support it. In general, the LKMM allows us to reason about
higher abstractions (such as locking) at a higher level, without
necessarily being able to verify the architecture-specific details of
the implementations.

> In [2], Will wrote:
> "[...] having them [the RMWs] closer to RCsc[/to the semantics of
> locks] would make it easier to implement and reason about generic
> locking implementations (i.e. reduce the number of special ordering
> cases and/or magic barrier macros)"
> "magic barrier macros" as in "mmh, if we accept this patch, we _should_
> be auditing the various implementations/code to decide where to place a
> smp_barrier_promote_ordinary_release_acquire_to_unlock_lock()" ;-)
> or the like, and "special ordering cases" as in "arrgh, (otherwise) we
> are forced to reason on a per-arch basis while looking at generic code".

Currently the LKMM does not permit architecture-specific reasoning. It
would have to be extended (in a different way for each architecture)

For example, one could use herd's POWER model combined with the POWER
compilation scheme and the POWER-specific implementation of spinlocks
for such reasoning. The LKMM alone is not sufficient.

Sure, programming and reasoning about the kernel would be easier if all
architectures were the same. Unfortunately, we (and the kernel) have
to live in the real world.

> (Remark: ordinary release/acquire are building blocks for code such as
> qspinlock, (q)rwlock, mutex, rwsem, ... and what else??).

But are these building blocks used the same way for all architectures?

> To avoid further repetition, I conclude by confirming all the concerns
> and my assessment of this patch as pointed out in [3]; the subsequent
> discussion, although not conclusive, presented several suggestions for
> improvement (IMO).