Re: LKMM: Making RMW barriers explicit

From: Jonas Oberhauser
Date: Thu May 23 2024 - 10:28:10 EST




Am 5/22/2024 um 6:54 PM schrieb Andrea Parri:
Alan, all,

("randomly" picking a recent post in the thread, after having observed
this discussion for a while...)

It would be better if there was a way to tell herd7 not to add the 'mb
tag to failed instructions in the first place. This approach is
brittle; see below.

AFAIU, changing the herd representation to generate mb-accesses in place
of certain mb-fences...

If you do want to use this approach, it should be simplified. All you
need is:

[M] ; po ; [RMW_MB]

[RMW_MB] ; po ; [M]

This is because events tagged with RMW_MB always are memory accesses,
and accesses that aren't part of the RMW are already covered by the
fencerel(Mb) thing above.

.. and updating the .cat file to the effects of something like

-let mb = ([M] ; fencerel(Mb) ; [M]) |
+let mb = (([M] ; po? ; [Mb] ; po? ; [M]) \ id) |

.. can hardly be called "making RMW barriers explicit". (So much so
that the first commit in PR #865 was titled "Remove explicit barriers
from RMWs". :-))

Overall, this discussion rather seems to confirm the close link between
tools/memory-model/ and herdtools7. (After all, to what extent could
any putative RMW_MB be considered "explicit" without _knowing the under-
lying representation of the RMW operations...)


My view is a bit different. There's more or less standard theory for weak memory models, including how operations at the source code level map to events in the graph.

Part of that standard theory are relations like rmw, and the circumstances under which they appear in the graph.

You'll see these relations in generic papers about weak memory models, like GenMC, Hahn, etc. as well as in pretty much every specific memory model like TSO, C11, PowerPC, VMM, etc., totally independently of herd7 (even though these notations have historically developed together with herd).

Naively I would expect that a tool like herd7 would be a generic WMM exploration tool, which follows these standards with a very obvious 1:1 mapping of what we see in the code and the events we see in the graph, plus perhaps a thin and explicit configurable herd7-specific mapping layer like what we see in linux_kernel.cat to configure the syntax for a specific model.

In that mapping layer, we currently see that xchg() is an exchange operation with an Mb tag. Just like an smp_load_acquire is a read with an acquire tag.
Or so it seems.

Instead, we find that contrary to what's written in that file, and contrary to the conventions, an xchg() is an F[mb] ; R[once] ; W[once] ; F[Mb]. And in fact a cmp_xchg could even be a R[once].

That's because the herd7 tool isn't quite as generic as one might think, but rather specifically "detects" that it's running the LKMM and then the mapping isn't what you'd naively think, but something hidden in the OCaml implementation of herd7.

That would be comparable to a popular tool for matrix calculations using the syntax
[ 10 5 4
3 4 2 ]

to define a 2x3 matrix, unless one of the values is -3, in which case it becomes a vector of 6 elements, because that's what a really important user of the tool wanted, and then didn't see enough of a need to change.

My point is that to anyone who has seen standard matrix notation, this syntax in a matrix-computation-tool looks like it would be a matrix, right? And it usually is a matrix; then it should always be a matrix.

I usually say I don't look much at natural language documentation and only read code, because code doesn't lie. In this case, the natural language documentation is saying the correct thing thanks to the hard work of a lot of people, but the code (in .cat etc.) doesn't do what it seems to be doing.

And I think that should be changed, both to reduce the anyways high mental load of reading the code without having to do mental transformations in your head, and also to make herd more Lkmm-agnostic.
In the ideal world, herd doesn't know anything about Lkmm except what we tell it through tools/memory-model, and generic things like C syntax + semantics.

So when using syntax like dom(rmw), I don't see it as confirming the close link to herd more than before, but rather depending more on standard notations and conventions, and relying on herd's close link to those standard notations (such as using rmw for the rmw relation and dom(r) for the domain of r) for dom(rmw) to mean what anyone who has deeply read a couple of modern WMM papers would expect.



My understanding is that
this discussion was at least in part motivated by a desire to experiment
and familiarize with the current herd representation

I would phrase it more extreme: I want to get rid of the unnecessary non-standard parts of the herd representation.

Those parts have led me astray several times. Ok, who cares about me, but even Luc once forgot about the non-standard parts and thought it is a bug:

https://github.com/herd/herdtools7/issues/384#issuecomment-1132859904

I also know that Viktor stumbled over it before and also suggested we change it.

I think there's 0 benefit to them, they're just wasting people's time and energy and lead to misunderstanding.

Ok, this e-mail became long. Hope it at least helps clarify my motivation :))

jonas