Re: LKMM: Making RMW barriers explicit
From: Jonas Oberhauser
Date: Thu May 23 2024 - 10:28:10 EST
Am 5/22/2024 um 6:54 PM schrieb Andrea Parri:
Alan, all,
("randomly" picking a recent post in the thread, after having observed
this discussion for a while...)
It would be better if there was a way to tell herd7 not to add the 'mb
tag to failed instructions in the first place. This approach is
brittle; see below.
AFAIU, changing the herd representation to generate mb-accesses in place
of certain mb-fences...
If you do want to use this approach, it should be simplified. All you
need is:
[M] ; po ; [RMW_MB]
[RMW_MB] ; po ; [M]
This is because events tagged with RMW_MB always are memory accesses,
and accesses that aren't part of the RMW are already covered by the
fencerel(Mb) thing above.
.. and updating the .cat file to the effects of something like
-let mb = ([M] ; fencerel(Mb) ; [M]) |
+let mb = (([M] ; po? ; [Mb] ; po? ; [M]) \ id) |
.. can hardly be called "making RMW barriers explicit". (So much so
that the first commit in PR #865 was titled "Remove explicit barriers
from RMWs". :-))
Overall, this discussion rather seems to confirm the close link between
tools/memory-model/ and herdtools7. (After all, to what extent could
any putative RMW_MB be considered "explicit" without _knowing the under-
lying representation of the RMW operations...)
My view is a bit different. There's more or less standard theory for
weak memory models, including how operations at the source code level
map to events in the graph.
Part of that standard theory are relations like rmw, and the
circumstances under which they appear in the graph.
You'll see these relations in generic papers about weak memory models,
like GenMC, Hahn, etc. as well as in pretty much every specific memory
model like TSO, C11, PowerPC, VMM, etc., totally independently of herd7
(even though these notations have historically developed together with
herd).
Naively I would expect that a tool like herd7 would be a generic WMM
exploration tool, which follows these standards with a very obvious 1:1
mapping of what we see in the code and the events we see in the graph,
plus perhaps a thin and explicit configurable herd7-specific mapping
layer like what we see in linux_kernel.cat to configure the syntax for a
specific model.
In that mapping layer, we currently see that xchg() is an exchange
operation with an Mb tag. Just like an smp_load_acquire is a read with
an acquire tag.
Or so it seems.
Instead, we find that contrary to what's written in that file, and
contrary to the conventions, an xchg() is an F[mb] ; R[once] ; W[once]
; F[Mb]. And in fact a cmp_xchg could even be a R[once].
That's because the herd7 tool isn't quite as generic as one might think,
but rather specifically "detects" that it's running the LKMM and then
the mapping isn't what you'd naively think, but something hidden in the
OCaml implementation of herd7.
That would be comparable to a popular tool for matrix calculations using
the syntax
[ 10 5 4
3 4 2 ]
to define a 2x3 matrix, unless one of the values is -3, in which case it
becomes a vector of 6 elements, because that's what a really important
user of the tool wanted, and then didn't see enough of a need to change.
My point is that to anyone who has seen standard matrix notation, this
syntax in a matrix-computation-tool looks like it would be a matrix,
right? And it usually is a matrix; then it should always be a matrix.
I usually say I don't look much at natural language documentation and
only read code, because code doesn't lie. In this case, the natural
language documentation is saying the correct thing thanks to the hard
work of a lot of people, but the code (in .cat etc.) doesn't do what it
seems to be doing.
And I think that should be changed, both to reduce the anyways high
mental load of reading the code without having to do mental
transformations in your head, and also to make herd more Lkmm-agnostic.
In the ideal world, herd doesn't know anything about Lkmm except what we
tell it through tools/memory-model, and generic things like C syntax +
semantics.
So when using syntax like dom(rmw), I don't see it as confirming the
close link to herd more than before, but rather depending more on
standard notations and conventions, and relying on herd's close link to
those standard notations (such as using rmw for the rmw relation and
dom(r) for the domain of r) for dom(rmw) to mean what anyone who has
deeply read a couple of modern WMM papers would expect.
My understanding is that
this discussion was at least in part motivated by a desire to experiment
and familiarize with the current herd representation
I would phrase it more extreme: I want to get rid of the unnecessary
non-standard parts of the herd representation.
Those parts have led me astray several times. Ok, who cares about me,
but even Luc once forgot about the non-standard parts and thought it is
a bug:
https://github.com/herd/herdtools7/issues/384#issuecomment-1132859904
I also know that Viktor stumbled over it before and also suggested we
change it.
I think there's 0 benefit to them, they're just wasting people's time
and energy and lead to misunderstanding.
Ok, this e-mail became long. Hope it at least helps clarify my
motivation :))
jonas