Re: [PATCH 6/24] make atomic_read() behave consistently on frv

From: Paul E. McKenney
Date: Tue Aug 14 2007 - 13:02:11 EST


On Tue, Aug 14, 2007 at 03:34:25PM +1000, Nick Piggin wrote:
> Paul E. McKenney wrote:
> >On Mon, Aug 13, 2007 at 01:15:52PM +0800, Herbert Xu wrote:
> >
> >>Paul E. McKenney <paulmck@xxxxxxxxxxxxxxxxxx> wrote:
> >>
> >>>On Sat, Aug 11, 2007 at 08:54:46AM +0800, Herbert Xu wrote:
> >>>
> >>>>Chris Snook <csnook@xxxxxxxxxx> wrote:
> >>>>
> >>>>>cpu_relax() contains a barrier, so it should do the right thing. For
> >>>>>non-smp architectures, I'm concerned about interacting with interrupt
> >>>>>handlers. Some drivers do use atomic_* operations.
> >>>>
> >>>>What problems with interrupt handlers? Access to int/long must
> >>>>be atomic or we're in big trouble anyway.
> >>>
> >>>Reordering due to compiler optimizations. CPU reordering does not
> >>>affect interactions with interrupt handlers on a given CPU, but
> >>>reordering due to compiler code-movement optimization does. Since
> >>>volatile can in some cases suppress code-movement optimizations,
> >>>it can affect interactions with interrupt handlers.
> >>
> >>If such reordering matters, then you should use one of the
> >>*mb macros or barrier() rather than relying on a possibly
> >>hidden volatile cast.
> >
> >
> >If communicating among CPUs, sure. However, when communicating between
> >mainline and interrupt/NMI handlers on the same CPU, the barrier() and
> >most especially the *mb() macros are gross overkill. So there really
> >truly is a place for volatile -- not a large place, to be sure, but a
> >place nonetheless.
>
> I really would like all volatile users to go away and be replaced
> by explicit barriers. It makes things nicer and more explicit... for
> atomic_t type there probably aren't many optimisations that can be
> made which volatile would disallow (in actual kernel code), but for
> others (eg. bitops, maybe atomic ops in UP kernels), there would be.
>
> Maybe it is the safe way to go, but it does obscure cases where there
> is a real need for barriers.

I prefer burying barriers into other primitives.

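Just to illustrate what I mean by burying them (a minimal userspace
sketch, not kernel code -- the primitive names are made up, and only the
compiler-level barrier is shown, which is all that mainline-vs-interrupt
on a single CPU needs):

#include <stdio.h>

/* the compiler barrier lives inside the primitive, not at the call site */
static inline void store_release_int(int *p, int v)
{
	__asm__ __volatile__("" : : : "memory");
	*p = v;
}

static inline int load_acquire_int(int *p)
{
	int v = *p;
	__asm__ __volatile__("" : : : "memory");
	return v;
}

int main(void)
{
	static int data, flag;

	data = 42;
	store_release_int(&flag, 1);	/* data is written before flag */

	if (load_acquire_int(&flag))	/* flag is read before data */
		printf("data = %d\n", data);
	return 0;
}
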
> Many atomic operations are allowed to be reordered between CPUs, so
> I don't have a good idea for the rationale to order them within the
> CPU (also loads and stores to long and ptr types are not ordered like
> this, although we do consider those to be atomic operations too).
>
> barrier() in a way is like enforcing sequential memory ordering
> between process and interrupt context, whereas volatile is just
> enforcing coherency of a single memory location (and as such is
> cheaper).

barrier() is useful, but it has the very painful side-effect of forcing
the compiler to dump temporaries. So we do need something that is
not quite so global in effect.

> What do you think of this crazy idea?
>
> /* Enforce a compiler barrier for only operations to location X.
> * Call multiple times to provide an ordering between multiple
> * memory locations. Other memory operations can be assumed by
> * the compiler to remain unchanged and may be reordered
> */
> #define order(x) asm volatile("" : "+m" (x))

There was something very similar discussed earlier in this thread,
with quite a bit of debate as to exactly what the "m" constraint should
look like. I suggested something similar named ACCESS_ONCE in the
context of RCU (http://lkml.org/lkml/2007/7/11/664):

#define ACCESS_ONCE(x) (*(volatile typeof(x) *)&(x))

The nice thing about this is that it works for both loads and stores.
Not clear that order() above does this -- I get compiler errors when
I try something like "b = order(a)" or "order(a) = 1" using gcc 4.1.2.
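
For concreteness, here is the sort of thing ACCESS_ONCE permits on both
sides of an assignment (trivial standalone example; the variable and
function names are made up):

#define ACCESS_ONCE(x) (*(volatile typeof(x) *)&(x))

static int a, b;

void demo(void)
{
	b = ACCESS_ONCE(a);	/* forced load of a  */
	ACCESS_ONCE(a) = 1;	/* forced store to a */
}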

Thanx, Paul