Re: [RFC] tools/memory-model: Rule out OOTA
From: Alan Stern
Date: Wed Jan 08 2025 - 13:47:20 EST
On Wed, Jan 08, 2025 at 06:33:05PM +0100, Jonas Oberhauser wrote:
>
>
> Am 1/7/2025 um 5:09 PM schrieb Alan Stern:
> > Is this really valid? In the example above, if there were no other
> > references to a or b in the rest of the program, the compiler could
> > eliminate them entirely.
>
> In the example above, this is not possible, because the address of a/b have
> escaped the function and are not deallocated before synchronization happens.
> Therefore the compiler must assume that a/b are accessed inside the compiler
> barrier.
I'm not quite sure what you mean by that, but if the compiler has access
to the entire program and can do a global analysis then it can recognize
cases where accesses that may seem to be live aren't really.
However, I admit this objection doesn't really apply to Linux kernel
programming.
> > (Whether the result could count as OOTA is
> > open to question, but that's not the point.) Is it not possible that a
> > compiler might find other ways to defeat your intentions?
>
> The main point (which I already mentioned in the previous e-mail) is if the
> object is deallocated without synchronization (or never escapes the function
> in the first place).
>
> And indeed, any such case renders the added rule unsound. It is in a sense
> unrelated to OOTA; cases where the load/store can be elided are never OOTA.
That is a matter of definition. In our paper, Paul and I described
instances of OOTA in which all the accesses have been optimized away as
"trivial".
> Of course that is outside the current scope of what herd7 needs to deal with
> or can express, because deallocation isn't a thing in herd7.
>
> Nevertheless, trying to specify inside cat when an access is "live" --
> relevant enough that the compiler will keep it around -- is hard and tedious
> (it's the main source of complication in the patch).
>
> A much better way would be to add a base set of "live loads and stores"
> Live, which are the loads and stores that the compiler must consider to be
> live. Just like addr, ctrl, etc., we don't have to specify these in cat, and
> rather rely on herd7 to correctly .
I agree that determining which accesses are live (in the sense that the
compiler cannot optimize them out of existence) accounts for a large
part of the difficulty of analyzing plain accesses in general, and OOTA
in particular.
> If an access interacts with an access of another thread (by reading from it
> or being read from it), it must be live.
This is the sort of approximation I'm a little uncomfortable with. It
would be better to say that a store which is read from by a live load
must be live. I don't see why a load which reads from a live store has
to be live.
> Then we could formulate the rule as
>
> +let to-w = (overwrite & int) | (addr | rmb ; [Live])? ; rwdep ; ([Live] ;
> wmb)?
>
> (the latter case being a generalization of the current `addr ; [Plain] ;
> wmb` and `rwdep` cases of to-w, assuming we restrict it Life accesses - it
> is otherwise also unsound:
>
> int a[2] = {0};
> int r1 = READ_ONCE(*x);
> a[r1] = 0; // compiler will just remove this
> smp_wmb();
> WRITE_ONCE(*y, 1);
>
Yes, and we recognize that this part of the rule is on shaky ground.
> )
>
> The formulation in the patch is just based on a complicated and close but
> imperfect approximation of Live.
Maybe you can reformulate the patch to make this more explicit.
In any case, it seems that any approximation we can make to Live will be
subject to various sorts of errors.
Alan