On Tue, Aug 14, 2007 at 03:34:25PM +1000, Nick Piggin wrote:
Maybe it is the safe way to go, but it does obscure cases where there
is a real need for barriers.
I prefer burying barriers into other primitives.
Many atomic operations are allowed to be reordered between CPUs, so
I don't see a good rationale for ordering them within a CPU (also,
loads and stores to long and ptr types are not ordered like this,
although we do consider those to be atomic operations too).
barrier() in a way is like enforcing sequential memory ordering
between process and interrupt context, whereas volatile is just
enforcing coherency of a single memory location (and as such is
cheaper).
barrier() is useful, but it has the very painful side-effect of forcing
the compiler to dump temporaries. So we do need something that is
not quite so global in effect.
What do you think of this crazy idea?
/* Enforce a compiler barrier for only operations to location X.
 * Call multiple times to provide an ordering between multiple
 * memory locations. Other memory operations can be assumed by
 * the compiler to remain unchanged and may be reordered.
 */
#define order(x) asm volatile("" : "+m" (x))
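A rough sketch of how order() might be used as a standalone statement, assuming GCC or a compatible compiler. The names flag and spin_until_set() are made up for illustration, and the macro is spelled with __asm__ __volatile__ only so that it also builds under strict -std= modes:

```c
/* Empty asm that claims to read and write x ("+m"), so the compiler
 * must not keep x cached in a register across this point.
 * GCC extension; this constrains the compiler only, not the CPU. */
#define order(x) __asm__ __volatile__("" : "+m" (x))

/* Hypothetical variable that an interrupt handler might set. */
static int flag;

/* Poll flag at most max_spins times; return how many spins were done. */
static int spin_until_set(int max_spins)
{
    int spins = 0;

    while (!flag && spins < max_spins) {
        order(flag);    /* force a fresh load of flag on the next test */
        spins++;
    }
    return spins;
}
```

Only flag is affected here, so the compiler should be free to keep spins and any other temporaries in registers, which is the point of not using barrier().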
There was something very similar discussed earlier in this thread,
with quite a bit of debate as to exactly what the "m" flag should
look like. I suggested something similar named ACCESS_ONCE in the
context of RCU (http://lkml.org/lkml/2007/7/11/664):
#define ACCESS_ONCE(x) (*(volatile typeof(x) *)&(x))
The nice thing about this is that it works for both loads and stores.
Not clear that order() above does this -- I get compiler errors when
I try something like "b = order(a)" or "order(a) = 1" using gcc 4.1.2.
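To illustrate the difference, here is a small sketch assuming GCC (the names a, load_then_store() and load_with_order() are hypothetical, and __typeof__ / __asm__ are used only so it builds under strict -std= modes). ACCESS_ONCE() expands to an lvalue expression, so it can sit on either side of an assignment, whereas order() expands to an asm statement and so has to stand on its own line:

```c
#define ACCESS_ONCE(x) (*(volatile __typeof__(x) *)&(x))
#define order(x) __asm__ __volatile__("" : "+m" (x))

/* Hypothetical shared variable. */
static int a;

/* ACCESS_ONCE() works for both the load and the store. */
static int load_then_store(int newval)
{
    int old = ACCESS_ONCE(a);   /* volatile load of a */

    ACCESS_ONCE(a) = newval;    /* volatile store to a */
    return old;
}

/* order() is a statement, not an expression: "b = order(a)" does not
 * compile, so the access has to be written separately. */
static int load_with_order(void)
{
    int b;

    order(a);   /* compiler must treat a as possibly changed here */
    b = a;
    return b;
}
```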