Re: Memory-ordering recipes

From: Paul E. McKenney
Date: Thu Sep 21 2017 - 12:40:59 EST


On Thu, Sep 21, 2017 at 06:15:47PM +0200, Peter Zijlstra wrote:
> On Thu, Sep 21, 2017 at 08:26:42AM -0700, Paul E. McKenney wrote:
> > ISA2 is the first one on page 2, and has this pattern of reads and
> > writes:
> >
> > CPU 0 CPU 1 CPU 2
> >
> > WRITE_ONCE(x, 1); r1 = READ_ONCE(y); r2 = READ_ONCE(z);
> > WRITE_ONCE(y, 1); WRITE_ONCE(z, 1); r3 = READ_ONCE(x);
> >
> > BUG_ON(r1 == 1 && r2 == 1 && r3 == 0);
> >
> > Arbitrary ordering can be added to all of these litmus-test patterns,
> > for example, the writes to y and z might become smp_store_release()
> > and the read from z might become smp_load_acquire(). Or, alternatively,
> > smp_mb() might be placed between accesses on all thread CPUs. The key
> > point is that "ISA2" identifies not a specific litmus test, but instead a
> > family of them with the same pattern of reads, writes and RMW operations,
> > but with different ordering properties.
> >
> > Z6.3 is the second one on page 2:
> >
> > CPU 0 CPU 1 CPU 2
> >
> > WRITE_ONCE(x, 2); r1 = READ_ONCE(y); r2 = READ_ONCE(z);
> > WRITE_ONCE(y, 1); WRITE_ONCE(z, 1); WRITE_ONCE(x, 1);
> >
> > BUG_ON(r1 == 1 && r2 == 1 && x == 2);
>
> But why are these useful to include in a recipes list? I would imagine
> those should cover the simple 2 threads stuff. Once you go fancy and
> need 3 CPUs I feel people had better know wth they're on about.

Release-acquire chains are good material for beginners, and they are
one of the few memory-ordering patterns that extend nicely to more than
two CPUs, hence the three-CPU examples.

> > LB is the last on on the extreme left of page 1. "LB" stands for
> > "load buffering", and each CPU's first access is a load and last
> > access is a store:
> >
> > CPU 0 CPU 1
> >
> > r1 = READ_ONCE(x); r2 = READ_ONCE(y);
> > WRITE_ONCE(y, 1); WRITE_ONCE(x, 1);
> >
> > BUG_ON(r1 == 1 && r2 == 1);
>
> > MP is the second on the extreme left of page 1. "MP" stands for "message
> > passing", and is used very heavily. The idea is that "x" is the message
> > (sent by CPU 0), and "y" is a flag saying that the message is ready to
> > be received (by CPU 1).
> >
> > CPU 0 CPU 1
> >
> > WRITE_ONCE(x, 1); r1 = READ_ONCE(y);
> > WRITE_ONCE(y, 1); r1 = READ_ONCE(x);
> >
> > BUG_ON(r1 == 1 && r2 == 0);
>
> Right, these two are fairly common patterns.
>
> > SB is the fourth on the extreme left of page 1. "SB" stands for "store
> > buffering" because systems without store buffers won't reorder this one.
> >
> > CPU 0 CPU 1
> >
> > WRITE_ONCE(x, 1); WRITE_ONCE(y, 1);
> > r1 = READ_ONCE(y); r2 = READ_ONCE(x);
> >
> > BUG_ON(r1 == 0 && r2 == 0);
> >
> > Does that help?
> >
> > Oh, and the actual recipes would include ordering as indicated by
> > the sub-bullets.
>
> Which just generates a terrible lot of noise. Why would people be
> interested in these permutations? Why not the minimal set that makes the
> guarantee?

Hmmm... Exactly what do you consider this minimal set to be?
Exactly which of the above scenarios would you leave out? Are there
other scenarios that you would want to include?

> Also, none of these cover 'simple' stuff like a ring-buffer.

Control dependencies are now 'simple'? ;-)

> So I have to ask, what is the purpose of this recipes list?

To give people common usage patterns to commit to memory, as an
alternative to running the tool every time they turn around. For the
people who generate code quickly, frequently running the tool would be
a huge productivity hit. But it is a probabilities game. If a given
usage pattern is rare enough, it is better to leave it to the tool.

Also, to guide code style, urging people to use straightforward designs
instead of the much more strange ones that they might otherwise
gravitate to.

Thanx, Paul