On Fri, Jan 17, 2025 at 12:29:34PM +0100, Jonas Oberhauser wrote:
On 1/17/2025 at 12:02 AM, Alan Stern wrote:
On Thu, Jan 16, 2025 at 08:08:22PM +0100, Jonas Oberhauser wrote:
I'm well aware that an absolute definition of semantic dependencies is not
easy to give.
In fact it's undecidable. No tool is ever going to be able to detect
semantic dependencies perfectly.
It depends.
Firstly, let's consider that the tool only runs on finite (or "finitized")
programs.
A program may be finite; that doesn't mean its executions are. Programs
can go into infinite loops. Are you saying that the tool would stop
verifying executions after (say) a billion steps? But then it would not
be able to detect semantic dependencies in programs that run longer.
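As a minimal sketch of such a program (the names are hypothetical, and
C11 atomics are used only for concreteness):

	#include <stdatomic.h>

	atomic_int flag;	/* hypothetical: set by some other thread */
	atomic_int x, y;

	void waiter(void)
	{
		/* An execution in which flag is never set takes
		 * infinitely many steps, although the program text
		 * itself is finite. */
		while (!atomic_load_explicit(&flag, memory_order_relaxed))
			;
		/* A tool that cuts every execution off at a fixed
		 * number of steps never reaches this dependent pair. */
		atomic_store_explicit(&y,
			atomic_load_explicit(&x, memory_order_relaxed),
			memory_order_relaxed);
	}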
Secondly, your definition depends on the compiler.
So in the sense that we don't know the compiler, it is undecidable.
But if you fix the compiler, you could still enumerate all executions under
that compiler and compute whether that compiler has a dependency or not.
You can't enumerate the executions unless you put an artificial bound on
the length of each execution, as noted above.
But as I mentioned before, I think you can define semantic dependencies more
appropriately, because you don't really need to preserve semantic
dependencies in C++, and in LKMM (and VMM) you have volatiles that restrict
what kinds of dependency elimination the compiler can do.
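For instance, a sketch of what a volatile access rules out (READ_ONCE()
and WRITE_ONCE() shown here as simplified volatile casts, not the full
kernel macros):

	#define READ_ONCE(x)		(*(volatile typeof(x) *)&(x))
	#define WRITE_ONCE(x, v)	(*(volatile typeof(x) *)&(x) = (v))

	int x, y;

	void reader(void)
	{
		/* The volatile load must be performed as written, so
		 * the compiler cannot substitute a value it thinks it
		 * knows for r and thereby erase the dependency into y. */
		int r = READ_ONCE(x);
		WRITE_ONCE(y, r);
	}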
What makes you think this "more appropriate" definition of semantic
dependency will be any easier to detect than the original one?
Yes. Actually I wonder: if you impose this "all loads are volatile"
restriction, can a globally analysing compiler still perform any
optimizations that a locally analysing compiler can not?
Yes, although whether they are pertinent is open to question. For
example, a globally optimizing compiler may observe that no thread ever
reads the value of a particular shared variable and then eliminate that
variable.
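A minimal sketch of that situation (the names are invented):

	#include <stdatomic.h>

	/* Hypothetical: no code anywhere in the program ever loads
	 * this variable. */
	atomic_int logged_state;

	void update(int v)
	{
		/* A locally analysing compiler must keep this store; a
		 * globally analysing one may prove the variable is never
		 * read and delete both the store and the variable. */
		atomic_store_explicit(&logged_state, v,
				      memory_order_relaxed);
	}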
It rather seems to be the other way around: the locally analysing
quasi-volatile compiler can at least do some local optimizations, while the
globally volatile compiler can not (e.g., x=1; y=x; can not become x=1; y=1;
for the global compiler, because x is volatile now).
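That example as a sketch (plain C; the _plain/_vol names are made up):

	int x_plain, y_plain;
	volatile int x_vol;
	int y_vol;

	void local_compiler(void)
	{
		x_plain = 1;
		y_plain = x_plain;	/* may be rewritten to y_plain = 1 */
	}

	void global_volatile_compiler(void)
	{
		x_vol = 1;
		y_vol = x_vol;	/* the load of x_vol must really occur */
	}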
In the paper we speculate that it may be sufficient to require globally
optimizing compilers to treat atomics as quasi-volatile, with the added
restriction that loads must not be omitted.
But then how would you characterize semantic dependencies, if you want
to allow the definition to include some dependencies that aren't
semantic but not so many that you ever create a cycle?
I don't know which definition is correct yet, but the point is that you
don't have to avoid making so many dependencies that you would create a
cycle. It just forbids the compiler from looking for cycles and optimizing
based on the existence of the cycle. (Looking for unused vars and removing
those is still allowed, under the informal argument that this simulates an
execution where no OOTA happened.)
At the moment, the only underlying idea we have driving our notions of
semantic dependency is that a true semantic dependency should be one
which must give rise to order in the machine code. In turn, this order
prevents OOTA cycles from occurring during execution. That is the
essence of the paper.
Can your ideas be fleshed out to a comparable degree?
So for example, if we merge x = y || y = x, the merge can turn it into
x=y=x or y=x=y (and then into an empty program), but not into a cyclic
dependency. So even though one side of the dependency may be violated, for
the sake of OOTA, we could still label both sides as dependent.
They _are_ both semantically dependent (in the original parallel
version, I mean). I don't see what merging has to do with it.
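For reference, here is one plausible rendering of that parallel version,
as a sketch with C11 relaxed atomics:

	#include <stdatomic.h>

	atomic_int x, y;	/* both initially 0 */

	void thread0(void)	/* x = y */
	{
		atomic_store_explicit(&x,
			atomic_load_explicit(&y, memory_order_relaxed),
			memory_order_relaxed);
	}

	void thread1(void)	/* y = x */
	{
		atomic_store_explicit(&y,
			atomic_load_explicit(&x, memory_order_relaxed),
			memory_order_relaxed);
	}

	/* Any final state other than x == y == 0 would be out of
	 * thin air: each load can only observe a value some store
	 * actually produced, and both stores merely copy. */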
Note that I was considering the case where they are not volatile.
With a compiler that does not treat them as volatile and merges the two
threads, there is, under your definition, no semantic dependency in at least
one direction, because there is no hardware realization H where you read
something else (of course you exclude such compilers, but I think
realistically they should be allowed).
My point is that we can say they are semantically dependent for the sake of
OOTA, not derive any ordering from these dependencies other than the
cyclical one, and therefore allow compilers to do one of the two optimizations
(make x=y no longer depend on y, or make y=x no longer depend on x) but not
do a cycle analysis to remove both dependencies and generate an OOTA value
(it can remove both dependencies by leaving x and y unchanged, though).
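A sketch of the merged program and the one optimization that remains
allowed (C11 relaxed atomics; names invented):

	#include <stdatomic.h>

	atomic_int x, y;

	void merged(void)
	{
		/* x = y: */
		int r = atomic_load_explicit(&y, memory_order_relaxed);
		atomic_store_explicit(&x, r, memory_order_relaxed);
		/* y = x: by forwarding the store above, the compiler
		 * may rewrite this as y = r, removing the dependency
		 * on x -- but it must not remove both dependencies at
		 * once, except by leaving x and y unchanged entirely. */
		atomic_store_explicit(&y,
			atomic_load_explicit(&x, memory_order_relaxed),
			memory_order_relaxed);
	}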
I don't understand.
I do not like the idea of tying the definition of OOTA (which needs to
apply to every implementation) to a particular clang compiler.
But that is what you have done, no? Whether something is an sdep depends on
the compiler, so compiler A could generate an execution that is OOTA in the
sdep definition of compiler B.
Yes, but this does not tie the definition to _one particular_ compiler.
That is, we don't say "This dependency is semantic because of the way
GCC 14.2.1 handles it." Rather, we define for each compiler whether a
dependency is semantic for _that_ compiler.
So either we need a definition of semantic dependency that
a) applies in all cases we practically need, and
b) is guaranteed by all compilers,
or we need to live with the fact that we do not have a semantic dependency
definition that is independent of the compilation (even of the same
compiler) and need to do our verification for that specific compilation.
Add the qualification that the definition should be practical to
evaluate, and I agree.