Re: [PATCH 2/2] tools/memory-model: Add write ordering by release-acquire and by locks
From: Will Deacon
Date: Wed Jul 04 2018 - 08:13:19 EST
On Wed, Jul 04, 2018 at 04:28:52AM -0700, Paul E. McKenney wrote:
> On Tue, Jul 03, 2018 at 01:28:17PM -0400, Alan Stern wrote:
> > PS: Paul, is the patch which introduced rel-rf-acq-po currently present
> > in any of your branches? I couldn't find it.
>
> It is not, I will add it back in. I misinterpreted your "drop this
> patch" on 2/2 as "drop both patches". Please accept my apologies!
>
> Just to double-check, the patch below should be added, correct?

Hang on, I'm not sure this patch is quite right either. We need to reach
agreement on whether or not we want to support native RCpc acquire/release
instructions before we work out what to do with this relation.

Will

> ------------------------------------------------------------------------
>
> Date: Thu, 21 Jun 2018 13:26:49 -0400 (EDT)
> From: Alan Stern <stern@xxxxxxxxxxxxxxxxxxx>
> To: LKMM Maintainers -- Akira Yokosawa <akiyks@xxxxxxxxx>, Andrea Parri
> <andrea.parri@xxxxxxxxxxxxxxxxxxxx>, Boqun Feng
> <boqun.feng@xxxxxxxxx>, David Howells <dhowells@xxxxxxxxxx>,
> Jade Alglave <j.alglave@xxxxxxxxx>, Luc Maranget <luc.maranget@xxxxxxxx>,
> Nicholas Piggin <npiggin@xxxxxxxxx>,
> "Paul E. McKenney" <paulmck@xxxxxxxxxxxxxxxxxx>, Peter Zijlstra <peterz@xxxxxxxxxxxxx>,
> Will Deacon <will.deacon@xxxxxxx>
> cc: Kernel development list <linux-kernel@xxxxxxxxxxxxxxx>
> Subject: [PATCH 1/2] tools/memory-model: Change rel-rfi-acq ordering to
> (rel-rf-acq-po & int)
> Message-ID: <Pine.LNX.4.44L0.1806211315550.2381-100000@xxxxxxxxxxxxxxxxxxxx>
>
> This patch changes the LKMM rule which says that an acquire which
> reads from an earlier release must be executed after that release (in
> other words, the release cannot be forwarded to the acquire). This is
> not true on PowerPC, for example.
>
> What is true instead is that any instruction following the acquire
> must be executed after the release. On some architectures this is
> because a write-release cannot be forwarded to a read-acquire; on
> others (including PowerPC) it is because the implementation of
> smp_load_acquire() places a memory barrier immediately after the
> load.
>
> This change to the model does not cause any change to the model's
> predictions. This is because any link starting from a load must be an
> instance of either po or fr. In the po case, the new rule will still
> provide ordering. In the fr case, we also have ordering because there
> must be a co link to the same destination starting from the
> write-release.
>
> Signed-off-by: Alan Stern <stern@xxxxxxxxxxxxxxxxxxx>
>
> ---
>
>
> [as1870]
>
>
> tools/memory-model/Documentation/explanation.txt | 35 ++++++++++++-----------
> tools/memory-model/linux-kernel.cat | 6 +--
> 2 files changed, 22 insertions(+), 19 deletions(-)
>
> Index: usb-4.x/tools/memory-model/linux-kernel.cat
> ===================================================================
> --- usb-4.x.orig/tools/memory-model/linux-kernel.cat
> +++ usb-4.x/tools/memory-model/linux-kernel.cat
> @@ -38,7 +38,7 @@ let strong-fence = mb | gp
> (* Release Acquire *)
> let acq-po = [Acquire] ; po ; [M]
> let po-rel = [M] ; po ; [Release]
> -let rfi-rel-acq = [Release] ; rfi ; [Acquire]
> +let rel-rf-acq-po = [Release] ; rf ; [Acquire] ; po
>
> (**********************************)
> (* Fundamental coherence ordering *)
> @@ -60,9 +60,9 @@ let dep = addr | data
> let rwdep = (dep | ctrl) ; [W]
> let overwrite = co | fr
> let to-w = rwdep | (overwrite & int)
> -let to-r = addr | (dep ; rfi) | rfi-rel-acq
> +let to-r = addr | (dep ; rfi)
> let fence = strong-fence | wmb | po-rel | rmb | acq-po
> -let ppo = to-r | to-w | fence
> +let ppo = to-r | to-w | fence | (rel-rf-acq-po & int)
>
> (* Propagation: Ordering from release operations and strong fences. *)
> let A-cumul(r) = rfe? ; r
> Index: usb-4.x/tools/memory-model/Documentation/explanation.txt
> ===================================================================
> --- usb-4.x.orig/tools/memory-model/Documentation/explanation.txt
> +++ usb-4.x/tools/memory-model/Documentation/explanation.txt
> @@ -1067,27 +1067,30 @@ allowing out-of-order writes like this t
> violating the write-write coherence rule by requiring the CPU not to
> send the W write to the memory subsystem at all!)
>
> -There is one last example of preserved program order in the LKMM: when
> -a load-acquire reads from an earlier store-release. For example:
> +There is one last example of preserved program order in the LKMM; it
> +applies to instructions po-after a load-acquire which reads from an
> +earlier store-release. For example:
>
> smp_store_release(&x, 123);
> r1 = smp_load_acquire(&x);
> + WRITE_ONCE(y, 246);
>
> If the smp_load_acquire() ends up obtaining the 123 value that was
> -stored by the smp_store_release(), the LKMM says that the load must be
> -executed after the store; the store cannot be forwarded to the load.
> -This requirement does not arise from the operational model, but it
> -yields correct predictions on all architectures supported by the Linux
> -kernel, although for differing reasons.
> -
> -On some architectures, including x86 and ARMv8, it is true that the
> -store cannot be forwarded to the load. On others, including PowerPC
> -and ARMv7, smp_store_release() generates object code that starts with
> -a fence and smp_load_acquire() generates object code that ends with a
> -fence. The upshot is that even though the store may be forwarded to
> -the load, it is still true that any instruction preceding the store
> -will be executed before the load or any following instructions, and
> -the store will be executed before any instruction following the load.
> +written by the smp_store_release(), the LKMM says that the store to y
> +must be executed after the store to x. In fact, the only way this
> +could fail would be if the store-release was forwarded to the
> +load-acquire; the LKMM says it holds even in that case. This
> +requirement does not arise from the operational model, but it yields
> +correct predictions on all architectures supported by the Linux
> +kernel, although for differing reasons:
> +
> +On some architectures, including x86 and ARMv8, a store-release cannot
> +be forwarded to a load-acquire. On others, including PowerPC and
> +ARMv7, smp_load_acquire() generates object code that ends with a
> +fence. The result is that even though the store-release may be
> +forwarded to the load-acquire, it is still true that the store-release
> +(and all preceding instructions) will be executed before any
> +instruction following the load-acquire.
>
>
> AND THEN THERE WAS ALPHA
>
>
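
For readers who want to see which event pairs the relation in the quoted
patch actually orders, here is a minimal sketch. It is not herd7 and not the
cat language: it is plain Python that models relations as sets of event
pairs. The event names (Wx, Rx, Wy, Wz, Rz, Ww), the two-CPU layout and the
helper functions are all assumptions made up for illustration. Only the
three relation definitions are taken from the patch. The first three events
on CPU 0 correspond to the example in explanation.txt; Wz/Rz/Ww add a
release on a second CPU so the effect of "& int" is visible.

```python
# Hypothetical execution: event id -> (cpu, annotation).
EVENTS = {
    "Wx": (0, "release"),  # smp_store_release(&x, 123)
    "Rx": (0, "acquire"),  # r1 = smp_load_acquire(&x), reads from Wx
    "Wy": (0, "once"),     # the WRITE_ONCE() to y
    "Rz": (0, "acquire"),  # a later load-acquire on CPU 0, reads from Wz
    "Ww": (0, "once"),     # a store po-after Rz on CPU 0
    "Wz": (1, "release"),  # a store-release on CPU 1
}

# Program order per CPU (assumed), expanded into a transitive po relation.
PROGRAM = {0: ["Wx", "Rx", "Wy", "Rz", "Ww"], 1: ["Wz"]}
po = {(a, b)
      for evs in PROGRAM.values()
      for i, a in enumerate(evs)
      for b in evs[i + 1:]}

# Reads-from edges (assumed), and "int": pairs of events on the same CPU.
rf = {("Wx", "Rx"), ("Wz", "Rz")}
int_ = {(a, b) for a in EVENTS for b in EVENTS
        if EVENTS[a][0] == EVENTS[b][0]}
rfi = rf & int_


def seq(r1, r2):
    """Relational composition, written r1 ; r2 in cat."""
    return {(a, d) for (a, b) in r1 for (c, d) in r2 if b == c}


def ident(annotation):
    """Identity relation on events with the given annotation, e.g. [Release]."""
    return {(e, e) for e, (_, ann) in EVENTS.items() if ann == annotation}


release, acquire = ident("release"), ident("acquire")

# Old rule:  let rfi-rel-acq = [Release] ; rfi ; [Acquire]
rfi_rel_acq = seq(seq(release, rfi), acquire)

# New rule:  let rel-rf-acq-po = [Release] ; rf ; [Acquire] ; po
rel_rf_acq_po = seq(seq(seq(release, rf), acquire), po)

# What the patch adds to ppo:  (rel-rf-acq-po & int)
new_ppo_edges = rel_rf_acq_po & int_

print("old rfi-rel-acq:     ", sorted(rfi_rel_acq))
print("rel-rf-acq-po:       ", sorted(rel_rf_acq_po))
print("rel-rf-acq-po & int: ", sorted(new_ppo_edges))
```

In this toy execution the old rule orders the release against the acquire
itself (Wx before Rx), whereas the new rule orders Wx against everything
po-after Rx, which matches the changelog's "any instruction following the
acquire must be executed after the release". The cross-CPU pair (Wz, Ww) is
in rel-rf-acq-po but is filtered out by "& int", so it does not become a
ppo edge.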