Re: [RFC] tools/memory-model: Rule out OOTA

From: Jonas Oberhauser
Date: Fri Jan 17 2025 - 11:46:25 EST




Am 1/17/2025 um 4:52 PM schrieb Alan Stern:
On Thu, Jan 16, 2025 at 06:02:18PM -0500, Alan Stern wrote:
On Thu, Jan 16, 2025 at 08:08:22PM +0100, Jonas Oberhauser wrote:
I would say that the approach with volatile is overzealous because it tries
to create a "local" order solution to the problem that only requires a
"global" ordering solution. Since not every semantic dependency needs to
provide order in C++ -- only the cycle of dependencies -- it is totally ok
to add too many semantic dependency edges to a program, even those that are
not going to be exactly maintained by every compiler, as long as we can
ensure that globally, no dependency cycle occurs.

But then how would you characterize semantic dependencies, if you want
to allow the definition to include some dependencies that aren't
semantic but not so many that you ever create a cycle? This sounds like
an even worse problem than we started with!

An interesting side comment on this issue...

This is a slight variation of the example on page 19 (section 4.3) of
the paper. (Pretend this is actually C++ code, the shared variables are
all atomic, and their accesses are all relaxed.)

bool x, y, z;

void P0(bool *x, bool *y, bool *z) {
bool r1, r2;

r1 = *x;
r2 = *y;

*z = (r1 != r2);
}

The paper points out that although there is an apparent semantic
dependency from the load of x to the store to z, if the compiler is
allowed not to handle atomics as quasi volatile then the dependency
can be broken. Nevertheless, I am not able to think of a program that
could exhibit OOTA as a result of breaking the semantic dependency. The
best I can come up with is this:

[P0 as above]

void P1(bool *x, bool *y, bool *z) {
bool r3;

r3 = z;
x = r3;
}

void P2(bool *x, bool *y, bool *z) {
y = true;
}

exists (x=true /\ z=true)

If P2 were not present, this result could not occur in any physical
execution, even if the dependency in P0 is broken. With P2 this result
isn't OOTA, even in executions where P0 ends up storing z before loading
x, because P2 could have executed first, then P0, then P1.

So perhaps this is an example of what you were talking about -- a
dependency which may or may not be semantic, but either way cannot lead
to OOTA.

Yes, that looks like an example of what I have in mind.

If at the model level we just say "yes there is a dependency, but no it does not give any ordering guarantee", then the compiler is still free to break the dependency like in your example.

A thread P3 { r1 = z; atomic_thread_fence(); r2 = y; }
could still observe r2 == false, r1 == true, "showing" that the dependency was broken.

This would not violate such a model.

(if we consider consume, then that would need to restrict the compiler from eliminating the dependency like this)

That is not to say that I am 100% sure that it is possible to define sdep correctly to make this work.
One problem is if the compiler merges two threads (with an OOTA cycle of 3+ threads), it can turn sdep;rfe;sdep;rfe into sdep;rfi;sdep;rfe.
If sdep is too naive then it is easy to come up with counter examples where this sdep;rfi;sdep no longer provides ordering, making the whole sdep;rfe cycle possible.

I am not sure if sdep can be formalized in a way that ensures that this sdep;rfi;sdep edge would still need to be preserved.

Of course one could have inter-thread semantic dependencies and only forbid
isdep ; rf (e?)
from being reflexive...

Best wishes,
jonas