On Wed, Nov 12, 2014 at 07:23:22AM -0800, Alexander Duyck wrote:
> On 11/12/2014 02:15 AM, Peter Zijlstra wrote:
> > On Tue, Nov 11, 2014 at 01:12:32PM -0800, Alexander Duyck wrote:
> > > > Minor nit on naming, but load_acquire would match what we do with barriers,
> > > > where you simply drop the smp_ prefix if you want the thing to work on UP
> > > > systems too.
> > > The problem is this is slightly different, load_acquire in my mind would use
> > > a mb() call, I only use a rmb(). That is why I chose read_acquire as the
> > > name.
> > acquire is not about rmb vs mb, do read up on
> > Documentation/memory-barriers.txt. It's a distinctly different semantic.
> > Some archs simply lack the means of implementing this semantics and have
> > to revert to mb (stronger is always allowed).
> >
> > Using the read vs load to wreck the acquire semantics is just insane.
> Actually I have been reading up on it as I wasn't familiar with C11.

C11 is _different_ although somewhat related.

> Most
> of what I was doing was actually based on the documentation in barriers.txt
> which was referring to memory operations not loads/stores when referring to
> the acquire/release so I assumed the full memory barrier was required. I
> wasn't aware that smp_load_acquire was only supposed to be ordering loads,
> or that smp_store_release only applied to stores.

It does not.. an ACQUIRE is a semi-permeable barrier that doesn't allow
LOADs nor STOREs that are issued _after_ it to appear to happen _before_.
The RELEASE is the opposite number, it ensures LOADs and STOREs that are
issued _before_ cannot happen _after_.
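
As a rough userspace analogy only (C11 acquire/release is related to, but not the same as, the kernel's smp_load_acquire()/smp_store_release()), the pairing can be sketched with <stdatomic.h> and pthreads. The names payload, ready and message_passing_demo() are made up for this illustration. The RELEASE store keeps the earlier plain store to payload from appearing after it, and the ACQUIRE load keeps the later plain load of payload from appearing before it:

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

static int payload;		/* plain data, published through 'ready' */
static atomic_int ready;	/* flag the consumer polls */

static void *producer(void *arg)
{
	(void)arg;
	payload = 42;		/* issued _before_ the RELEASE */
	/* RELEASE: the store to payload cannot appear to happen after this. */
	atomic_store_explicit(&ready, 1, memory_order_release);
	return NULL;
}

static void *consumer(void *arg)
{
	int *out = arg;

	/* ACQUIRE: the load of payload cannot appear to happen before this. */
	while (!atomic_load_explicit(&ready, memory_order_acquire))
		;		/* spin until the flag is published */
	*out = payload;		/* issued _after_ the ACQUIRE */
	return NULL;
}

/* Runs one producer/consumer pair and returns what the consumer read. */
int message_passing_demo(void)
{
	pthread_t p, c;
	int seen = 0;
	int rc;

	payload = 0;
	atomic_store_explicit(&ready, 0, memory_order_relaxed);

	rc = pthread_create(&c, NULL, consumer, &seen);
	assert(rc == 0);
	rc = pthread_create(&p, NULL, producer, NULL);
	assert(rc == 0);

	pthread_join(p, NULL);
	pthread_join(c, NULL);
	return seen;
}
```

Once the consumer observes ready == 1 it is guaranteed to read the payload value written before the RELEASE. Neither side needs a full barrier for this.
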
This typically matches locking, where a lock (mutex_lock, spin_lock,
etc.) has ACQUIRE semantics and the unlock RELEASE, such that:
	spin_lock();
	a = 1;
	b = x;
	spin_unlock();
guarantees all LOADs (x) and STOREs (a,b) happen _inside_ the lock
region. What they do not guarantee is:
	y = 1;
	spin_lock();
	a = 1;
	b = x;
	spin_unlock();
	z = 4;
an order between y and z; both are allowed _into_ the region and can
cross there, like:
	spin_lock();
	...
	z = 4;
	y = 1;
	...
	spin_unlock();
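
To make the lock/unlock pairing concrete, here is a toy test-and-set lock in userspace C11, assuming nothing about how the kernel's spin_lock() is actually implemented. struct toy_lock, toy_lock_acquire(), toy_lock_release() and critical_section_demo() are hypothetical names for this sketch. The lock side uses ACQUIRE and the unlock side RELEASE, so a, b and x stay inside the region while y and z are only constrained in one direction each:

```c
#include <stdatomic.h>

/* Toy lock for illustration; not the kernel's spinlock. */
struct toy_lock {
	atomic_flag held;
};

static void toy_lock_acquire(struct toy_lock *l)
{
	/* ACQUIRE: accesses issued after this cannot appear before it. */
	while (atomic_flag_test_and_set_explicit(&l->held,
						 memory_order_acquire))
		;	/* spin until the flag was previously clear */
}

static void toy_lock_release(struct toy_lock *l)
{
	/* RELEASE: accesses issued before this cannot appear after it. */
	atomic_flag_clear_explicit(&l->held, memory_order_release);
}

static struct toy_lock lock = { ATOMIC_FLAG_INIT };
int a, b, x = 7, y, z;

void critical_section_demo(void)
{
	y = 1;			/* allowed to sink into the region */
	toy_lock_acquire(&lock);
	a = 1;			/* must stay inside the region */
	b = x;			/* must stay inside the region */
	toy_lock_release(&lock);
	z = 4;			/* allowed to hoist into the region */
}
```

Nothing here orders the store to y against the store to z. Another thread that does not take the lock may observe z == 4 before y == 1.
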
The only 'open' issue at the moment is if RELEASE+ACQUIRE := MB.
Currently we say this is not so, but Will (and I) would very much like
this to be so -- PPC64 being the only arch that actually makes this
distinction.