Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

From: Alan Stern
Date: Mon Jan 23 2023 - 14:58:34 EST


On Mon, Jan 23, 2023 at 05:16:27PM +0100, Jonas Oberhauser wrote:
>
>
> On 1/19/2023 5:41 PM, Alan Stern wrote:
> > The point is that in P1, we have Write ->(gp;gp) Read, but we do not
> > have Write ->(gp';gp') Read. Only Write ->gp' Read. So if you're using
> > gp' instead of gp, you'll analyze the litmus test as if it had only one
> > grace period but two critical sections, getting a wrong answer.
>
> Are you writing about the old model? Otherwise I don't see how this can give
> a wrong answer.
> gp' isn't used to count the grace periods (anymore?). The po<=rcu-link
> allows using both grace periods to create rcu-order between the two read
> side critical sections.
> For the old model I believe it.

Yes, I was talking about the old version of the memory model.

> > If G is a grace period, let's write t1(G) for the time when G starts and
> > t2(G) for the time when G ends.
> >
> > Likewise, if C is a read-side critical section, let's write t2(C) for
> > the time when C starts (or the lock executes if you prefer) and t1(C)
> > for the time when C ends (or the unlock executes). This terminology
> > reflects the "backward" role that critical sections play in the memory
> > model.
> >
> > Now we can characterize rcu-order and rcu-link in operational terms.
> > Let A and B each be either a grace period or a read-side critical
> > section. Then:
> >
> > A ->rcu-order B means t1(A) < t2(B), and
> >
> > A ->rcu-link B means t2(A) <= t1(B).
>
>
> That's a really elegant notation! I have thought about rcu-link and
> rcu-order as ordering ends or starts depending on which events are being
> ordered, but it quickly got out of hand because of all the different cases.
> With this notation it becomes quite trivial.
>
>
> > (Of course, we always have t1(X) < t2(X) for any grace period or
> > critical section X.)

Actually, it might make more sense to allow t1(C) = t2(C) for a critical
section C, because critical sections can be empty. Grace periods, by
contrast, always have to contain at least a full memory barrier.
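
To make the notation concrete, here is a minimal sketch in Python. It
models only the timing picture, not the cat definitions themselves.
The Interval class, the "gp"/"cs" tags and the numeric timestamps are
all assumptions made up for illustration; nothing here is part of the
LKMM or of herd7.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    kind: str     # "gp" (grace period) or "cs" (read-side critical section)
    start: float  # hypothetical wall-clock time at which it starts
    end: float    # hypothetical wall-clock time at which it ends


def t1(x):
    # Grace period: its start.  Critical section: its end ("backward" role).
    return x.start if x.kind == "gp" else x.end


def t2(x):
    # Grace period: its end.  Critical section: its start.
    return x.end if x.kind == "gp" else x.start


def rcu_order(a, b):
    # A ->rcu-order B means t1(A) < t2(B)
    return t1(a) < t2(b)


def rcu_link(a, b):
    # A ->rcu-link B means t2(A) <= t1(B)
    return t2(a) <= t1(b)


G = Interval("gp", 2.0, 5.0)
C = Interval("cs", 1.0, 3.0)   # starts before G starts, ends before G ends
E = Interval("cs", 4.0, 4.0)   # empty critical section

assert rcu_link(C, G)          # t2(C) = 1.0 <= t1(G) = 2.0
assert rcu_order(C, G)         # t1(C) = 3.0 <  t2(G) = 5.0
assert rcu_order(G, G)         # a lone gp relates to itself
assert not rcu_order(C, C)     # a critical section does not
assert t1(E) == t2(E)          # the empty case allowed above
```

With these definitions a critical section always has t2(C) <= t1(C),
with equality exactly in the empty case.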

> > This explains quite a lot. For example, we can justify including
> >
> > C ->rcu-link G
> >
> > into rcu-order as follows. From C ->rcu-link G we get that t2(C) <=
> > t1(G), in other words, C starts when or before G starts. Then the
> > Fundamental Law of RCU says that C must end before G ends, since
> > otherwise C would span all of G. Thus t1(C) < t2(G), which is C
> > ->rcu-order G.
> >
> > The case of G ->rcu-link C is similar.
> >
> > This also explains why rcu-link can be extended by appending (rcu-order
> > ; rcu-link)*.
>
> Indeed, by similar (but more clumsy) reasoning I observed that rcu-order can
> be thought of as "extending" rcu-link.
>
> > From X ->rcu-order Y ->rcu-link Z we get that t1(X) <
> > t2(Y) <= t1(Z) and thus t1(X) <= t1(Z). So if
> >
> > A ->rcu-link B ->(rcu-order ; rcu-link)* C
> >
> > then t2(A) <= t1(B) <= t1(C), which justifies A ->rcu-link C.
> >
> > The same sort of argument shows that rcu-order should be extendable by
> > appending (rcu-link ; rcu-order)* -- but not (rcu-order ; rcu-link)*.
> >
> > This also justifies why a lone gp belongs in rcu-order: G ->rcu-order G
> > holds because t1(G) < t2(G). But for critical sections we have t2(C) <
> > t1(C) and so C ->rcu-order C does not hold.
> I don't think that it justifies why it belongs there. It justifies that it
> could be included.
> Neither rcu-order nor rcu-link exactly capture the temporal ordering, they
> just imply it.
> For example, if you have L1 U1 and L2 U2 forming two read side critical
> sections C1 and C2, and
>     U1 ->(hb|pb)+ L2
> then I would say you would have
>     t1(C1) < t2(C2)
> but no rcu-order relation between any of the four events.

True, I should have said it suggests a reason for allowing rcu-order to
contain a lone gp.
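
The extension claims quoted further up can also be checked by brute
force, again purely at the level of the timing inequalities (which,
as you say, the relations imply but do not exactly capture). The
following is a rough sketch under that assumption: the tuple encoding,
the small integer time grid and the helper names are all invented for
the purpose.

```python
from itertools import product

GRID = range(4)  # hypothetical discrete time stamps

# (kind, start, end): a gp needs start < end, a cs may be empty.
INTERVALS = (
    [("gp", s, e) for s in GRID for e in GRID if s < e]
    + [("cs", s, e) for s in GRID for e in GRID if s <= e]
)


def t1(x):
    kind, start, end = x
    return start if kind == "gp" else end


def t2(x):
    kind, start, end = x
    return end if kind == "gp" else start


def order(a, b):   # timing reading of rcu-order
    return t1(a) < t2(b)


def link(a, b):    # timing reading of rcu-link
    return t2(a) <= t1(b)


def counterexample(premises, conclusion):
    """Search for A, B, C, D where A -p0-> B -p1-> C -p2-> D holds
    but the conclusion fails between A and D."""
    for q in product(INTERVALS, repeat=4):
        chain_ok = all(rel(q[i], q[i + 1]) for i, rel in enumerate(premises))
        if chain_ok and not conclusion(q[0], q[3]):
            return q
    return None


# rcu-link ; (rcu-order ; rcu-link) still behaves like rcu-link.
assert counterexample([link, order, link], link) is None
# rcu-order ; (rcu-link ; rcu-order) still behaves like rcu-order.
assert counterexample([order, link, order], order) is None
# rcu-order ; (rcu-order ; rcu-link) does not: a witness exists.
assert counterexample([order, order, link], order) is not None
```

A search over a finite grid is of course not a proof of the two
positive cases, but the inequalities given above cover those; the
third assertion just confirms that the bad extension really does fail.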

> > Assuming ordinary memory accesses occur in a single instant, you see why
> > it makes sense to consider (po ; rcu-order ; po) an ordering.
>
> Do you mean "execute" in a single instant?

Yes, or to put it another way, t1(X) = t2(X) if X is a load or store.

> > But when you're comparing grace periods or critical sections to each other,
> > things get a little ambiguous. Should G1 be considered to come before
> > G2 when t1(G1) < t1(G2), when t2(G1) < t2(G2), or when t2(G1) < t1(G2)?
> > Springing for (po ; rcu-order ; po?) amounts to choosing the second
> > alternative.
>
> Aha, I see! Powerful notation indeed.
> Keeping that in mind, wouldn't it make sense for pb also to be changed to
> `...;po?` ?

You mean changing the definition of pb to either:

prop ; strong-fence ; hb* ; po? ; [Marked]

or

prop ; strong-fence ; hb* ; [Marked] ; po? ; [Marked]

? Neither would be right. I'm sure you can easily come up with
examples of cycles in these relations, invalidating the propagation
axiom acyclic(pb).

rcu-fence is different because rcu-order has to begin and end with
either a grace period or a critical section, and both of these restrict
the execution order of surrounding events:

If X is a synchronize_rcu() or rcu_read_unlock() then events
po-before X must execute before X;

If X is a synchronize_rcu() or rcu_read_lock() then events
po-after X must execute after X.

The same cannot be said of hb or pb.

Alan