Re: LKMM: Making RMW barriers explicit

From: Hernan Ponce de Leon
Date: Thu May 23 2024 - 10:26:57 EST


On 5/23/2024 4:05 PM, Alan Stern wrote:
On Thu, May 23, 2024 at 02:54:05PM +0200, Jonas Oberhauser wrote:


Am 5/22/2024 um 4:20 PM schrieb Alan Stern:
It would be better if there was a way to tell herd7 not to add the 'mb
tag to failed instructions in the first place. This approach is
brittle; see below.

Hernan told me that in fact that is actually currently the case in herd7.
Failing RMW get assigned the Once tag implicitly.
Another thing that I'd suggest to change.

Indeed.

An alternative would be to have a way for the .cat file to remove the
'mb tag from a failed RMW instruction. But I don't know if this is
feasible.

For Mb it's feasible, as there is no Mb read or Mb store.

Mb = Mb & (~M | dom(rmw) | range(rmw))

However one would want to do the same for Acq and Rel.

For that one would need to distinguish e.g. between a read that comes from a
failed rmw instruction, and where the tag would disappear, or a normal
standalone read.

For example, by using two different acquire tags, 'acquire and 'rmw-acquire,
and defining

Acquire = Acquire | Rmw-acquire & (dom(rmw) | range(rmw))

Anyways we can do this change independently. So for now, we don't need
RMW_MB.

Overall, it seems better to have herd7 assign the right tag, but change
the way the .def file works so that it can tell herd7 which tag to use
in each of the success and failure cases.

I am not fully sure how herd7 uses the .def file, but I guess something like adding a second memory tag to __cmpxchg could work

cmpxchg(X,V,W) __cmpxchg{mb, once}(X,V,W)
cmpxchg_relaxed(X,V,W) __cmpxchg{once, once}(X,V,W)
cmpxchg_acquire(X,V,W) __cmpxchg{acquire, acquire}(X,V,W)
cmpxchg_release(X,V,W) __cmpxchg{release, release}(X,V,W)

Hernan


[M] ; po ; [RMW_MB]

[RMW_MB] ; po ; [M]

This is because events tagged with RMW_MB always are memory accesses,
and accesses that aren't part of the RMW are already covered by the
fencerel(Mb) thing above.

This has exactly the issue mentioned above - it will cause the rmw to have
an internal strong fence that on powerpc probably isn't there.

Oops, that's right. Silly oversight on my part. But at least you
understood what I meant.

We could do (with the assumption that Mb applies only to successful rmw):

[M] ; po ; [Mb & R]
[Mb & W] ; po ; [M]

That works.

Alan