Re: Documentation/memory-barriers.txt: How can READ_ONCE() and WRITE_ONCE() provide cache coherence?

From: Sergey Fedorov
Date: Mon Feb 29 2016 - 14:07:27 EST

On 28.02.2016 01:53, Paul E. McKenney wrote:
On Sat, Feb 27, 2016 at 11:13:00PM +0300, Sergey Fedorov wrote:
On 27.02.2016 00:31, Paul E. McKenney wrote:
Without READ_ONCE(), common sub-expression elimination optimizations
can cause later reads of a given variable to see an older value than
previous reads did. For a (silly) example:

a = complicated_pure_function(x);
b = x;
c = complicated_pure_function(x);

The compiler is within its rights to transform this into the following:

a = complicated_pure_function(x);
b = x;
c = a;

In this case, the assignment to b might see a newer value of x than did
the later assignment to c. This violates cache coherence, which states
that all reads from a given variable must agree on the order of values
taken on by that variable.
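
For illustration, a minimal user-space sketch of how marking the loads
keeps the compiler from reusing the first result. The READ_ONCE() below
is a simplified stand-in built on a volatile cast (using the GCC/Clang
__typeof__ extension), not the kernel's actual definition, and the body
of complicated_pure_function() is invented for the example:

```c
/* Simplified stand-in for the kernel's READ_ONCE() (an assumption, not
 * the kernel definition): the volatile cast forces a real load on every
 * use, so the compiler may not reuse an earlier value of x. */
#define READ_ONCE(x) (*(const volatile __typeof__(x) *)&(x))

int x;	/* shared variable, possibly updated concurrently elsewhere */

/* Hypothetical pure function; any side-effect-free function would do. */
static int complicated_pure_function(int v)
{
	return v * v + 1;
}

void sample(int *a, int *b, int *c)
{
	*a = complicated_pure_function(READ_ONCE(x));
	*b = READ_ONCE(x);
	/* x is loaded again here, so *c cannot be computed from a value
	 * older than the one seen by *b; the "c = a" shortcut is ruled out. */
	*c = complicated_pure_function(READ_ONCE(x));
}
```

With each load marked, the three reads of x are emitted in program
order, so the read feeding c cannot be folded into the read feeding a.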
I see how READ_ONCE() and WRITE_ONCE() can prevent the compiler from
speculating on variable values and optimizing memory accesses. But
concerning cache coherency itself, my understanding is that software
can really ensure hardware cache coherency by using one of the
following methods:
- by not using the caches
- by using some sort of cache maintenance instructions
- by using hardware cache coherency mechanisms (which is what is
normally used)

What kind of "cache coherency" do you mean?
All current systems supporting Linux guarantee that volatile accesses
to a given single variable will be seen in order, even when caches are
active, and without using any cache-coherence instructions. Note "a
given single variable". If there is more than one variable in play,
explicit memory ordering is required. The "volatile" is also important,
because the compiler (and in a few cases, the hardware) can reorder
non-volatile accesses.
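
As a rough sketch of the "more than one variable" case, here is a
message-passing pattern with a payload and a ready flag. It uses C11
atomics with release/acquire ordering as a user-space analogue of the
kernel's smp_store_release() and smp_load_acquire(); that substitution,
along with the names data, flag, producer() and consumer(), is an
assumption made for the example and does not come from this thread:

```c
#include <stdatomic.h>

static int data;		/* payload: a second, plain variable */
static atomic_int flag;		/* set to 1 once data is valid */

void producer(int value)
{
	data = value;
	/* Release: the store to data is ordered before the store to flag. */
	atomic_store_explicit(&flag, 1, memory_order_release);
}

/* Returns 1 and fills *out if the payload is ready, 0 otherwise. */
int consumer(int *out)
{
	/* Acquire: if flag is seen as 1, the matching data is visible too. */
	if (!atomic_load_explicit(&flag, memory_order_acquire))
		return 0;
	*out = data;
	return 1;
}
```

Per-variable coherence alone says nothing about the relative order of
the accesses to data and flag; the release/acquire pair is what ties
the two variables together.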

Thank you for the clarification. I think this was a bit confusing for me because I always think of cache coherence independently of high-level C objects like variables. For me, cache coherence is the behavior of the system in response to CPU(s) performing load/store operations on the same memory location.