Re: [PATCH] arch: Introduce read_acquire()

From: Peter Zijlstra
Date: Wed Nov 12 2014 - 10:38:11 EST


On Wed, Nov 12, 2014 at 07:23:22AM -0800, Alexander Duyck wrote:
>
> On 11/12/2014 02:15 AM, Peter Zijlstra wrote:
> >On Tue, Nov 11, 2014 at 01:12:32PM -0800, Alexander Duyck wrote:
> >>>Minor nit on naming, but load_acquire would match what we do with barriers,
> >>>where you simply drop the smp_ prefix if you want the thing to work on UP
> >>>systems too.
> >>The problem is this is slightly different, load_acquire in my mind would use
> >>a mb() call, I only use a rmb(). That is why I chose read_acquire as the
> >>name.
> >acquire is not about rmb vs mb, do read up on
> >Documentation/memory-barriers.txt. It's a distinctly different semantic.
> >Some archs simply lack the means of implementing these semantics and have
> >to revert to mb (stronger is always allowed).
> >
> >Using the read vs load to wreck the acquire semantics is just insane.
>
> Actually I have been reading up on it as I wasn't familiar with C11.

C11 is _different_ although somewhat related.

> Most
> of what I was doing was actually based on the documentation in barriers.txt
> which was referring to memory operations not loads/stores when referring to
> the acquire/release so I assumed the full memory barrier was required. I
> wasn't aware that smp_load_acquire was only supposed to be ordering loads,
> or that smp_store_release only applied to stores.

It does not. An ACQUIRE is a semi-permeable barrier that allows neither
LOADs nor STOREs that are issued _after_ it to appear to happen _before_
it. The RELEASE is its opposite number: it ensures LOADs and STOREs that
are issued _before_ it cannot appear to happen _after_ it.
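
As a rough userspace sketch of the pairing, assuming C11 atomics (which are
different from, although related to, the kernel's smp_store_release() and
smp_load_acquire()): the names payload, ready, publish() and try_consume()
are hypothetical and exist only for this illustration. In real use the two
functions would run on different CPUs.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical names for illustration; this is not kernel code. */
static int payload;		/* plain data, published via 'ready' */
static atomic_int ready;	/* flag carrying the RELEASE/ACQUIRE pair */

/*
 * Writer side: the RELEASE store keeps the earlier store to payload
 * from appearing to happen after the flag is set.
 */
static void publish(int value)
{
	/* This sketch only supports publishing once. */
	assert(!atomic_load_explicit(&ready, memory_order_relaxed));

	payload = value;
	atomic_store_explicit(&ready, 1, memory_order_release);
}

/*
 * Reader side: the ACQUIRE load keeps the later load of payload
 * from appearing to happen before the flag is read.
 * Returns 1 and fills *out if the flag was seen, 0 otherwise.
 */
static int try_consume(int *out)
{
	if (!atomic_load_explicit(&ready, memory_order_acquire))
		return 0;

	*out = payload;
	return 1;
}
```

A reader that sees ready set is thereby guaranteed to also see the payload
written before it; neither side needs a full mb() for that.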

This typically matches locking, where a lock (mutex_lock, spin_lock,
etc.) has ACQUIRE semantics and the unlock has RELEASE semantics, such that:

spin_lock();
a = 1;
b = x;
spin_unlock();

guarantees all LOADs (x) and STOREs (a, b) happen _inside_ the lock
region. What they do not guarantee is:


y = 1;
spin_lock();
a = 1;
b = x;
spin_unlock();
z = 4;

an order between y and z; both are allowed _into_ the region and can
cross there, like:

spin_lock();
...
z = 4;
y = 1;
...
spin_unlock();
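
To make the one-way nature of the lock region concrete, here is a minimal
sketch of a toy lock built from a C11 atomic_flag. It is a userspace
illustration under stated assumptions, not the kernel's spinlock; struct
toy_lock, toy_lock_acquire(), toy_lock_release() and example() are
hypothetical names.

```c
#include <stdatomic.h>

/* Hypothetical toy lock for illustration; not the kernel's spinlock. */
struct toy_lock {
	atomic_flag held;
};

/* ACQUIRE: accesses issued after this cannot appear to happen before it. */
static void toy_lock_acquire(struct toy_lock *l)
{
	while (atomic_flag_test_and_set_explicit(&l->held,
						 memory_order_acquire))
		;	/* spin until the current holder releases */
}

/* RELEASE: accesses issued before this cannot appear to happen after it. */
static void toy_lock_release(struct toy_lock *l)
{
	atomic_flag_clear_explicit(&l->held, memory_order_release);
}

static int a, b, x, y, z;

static void example(struct toy_lock *l)
{
	y = 1;			/* may move down, into the region */
	toy_lock_acquire(l);
	a = 1;			/* cannot leave the region */
	b = x;			/* cannot leave the region */
	toy_lock_release(l);
	z = 4;			/* may move up, into the region */
}
```

Nothing here orders the store to y against the store to z: the acquire only
stops later accesses from moving up, and the release only stops earlier
accesses from moving down.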


The only 'open' issue at the moment is whether RELEASE+ACQUIRE := MB.
Currently we say this is not so, but Will (and I) would very much like
this to be so -- PPC64 being the only arch that actually makes this
distinction.