Re: [RFC PATCH] sys_membarrier(): system/process-wide memory barrier (x86) (v12)

From: Peter Zijlstra
Date: Tue Mar 17 2015 - 12:38:33 EST


On Tue, Mar 17, 2015 at 01:13:36PM +0000, Mathieu Desnoyers wrote:
> > It's basically: WMB + ACQUIRE, which theoretically can leak a read in,
> > but nobody sane _delays_ reads, you want to speculate reads, not
> > postpone.
>
> If I believe the memory ordering table at
> https://en.wikipedia.org/wiki/Memory_ordering , there appear
> to be quite a few architectures that can reorder loads after loads,
> and loads after stores: Alpha, ARMv7, PA-RISC, SPARC RMO, x86 oostore
> and ia64. There may be subtle details that would allow us to
> do without the barriers in specific situations, but for that I'd
> very much like to hear what Paul has to say.

So I was starting to write that you can get load-after-load reordering
by one load being speculated further ahead than the other, but I suppose
you can delay loads just fine too.

Imagine getting a cache miss on a load, the OoO engine can then continue
execution until it hits a hard dependency, so you're effectively
delaying the load.
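
To make the load-after-load case concrete, here is a minimal userspace
sketch of the usual message-passing pattern, written with C11 atomics
and pthreads rather than kernel primitives. The names payload, ready and
run_message_passing() are made up for illustration. The release fence
stands in for smp_wmb() and the acquire fence for smp_rmb(); both C11
fences are somewhat stronger than their kernel counterparts. Without the
reader-side fence, a weakly ordered CPU would be free to satisfy the
load of payload before the load of ready and hand back stale data.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

static int payload;        /* plain data, published through 'ready' */
static atomic_int ready;   /* flag: set to 1 once payload is valid */

static void *writer(void *arg)
{
    (void)arg;
    payload = 42;
    /* Order the payload store before the flag store (smp_wmb() role). */
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&ready, 1, memory_order_relaxed);
    return NULL;
}

static void *reader(void *arg)
{
    int *out = arg;

    /* Spin until the flag is observed. */
    while (!atomic_load_explicit(&ready, memory_order_relaxed))
        ;
    /* Order the flag load before the payload load (smp_rmb() role). */
    atomic_thread_fence(memory_order_acquire);
    *out = payload;
    return NULL;
}

/* Runs one writer and one reader; returns the payload the reader saw. */
int run_message_passing(void)
{
    pthread_t w, r;
    int seen = -1;
    int rc;

    payload = 0;
    atomic_store(&ready, 0);

    rc = pthread_create(&r, NULL, reader, &seen);
    assert(rc == 0);
    rc = pthread_create(&w, NULL, writer, NULL);
    assert(rc == 0);

    pthread_join(w, NULL);
    pthread_join(r, NULL);
    return seen;
}
```

With both fences in place the reader can only ever return 42. Dropping
the acquire fence turns this into a data race, and the delayed-load
scenario above is one way that race can show up in practice.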

So yeah, if we want to be able to replace smp_rmb() with
barrier() + sys_membarrier() we need to promote the
smp_mb__before_spinlock() to smp_mb__after_unlock_lock() or so; that
would only penalize PPC a bit.
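
For reference, the userspace side of that replacement could look
roughly like the sketch below. It assumes the membarrier() system call
interface as it later appeared in mainline (a command argument plus a
flags argument), which is not the interface of this RFC version. The
SKETCH_* constants, have_membarrier, sketch_init(), fast_side_barrier()
and slow_side_barrier() are hypothetical names; the constant values are
meant to mirror MEMBARRIER_CMD_QUERY and MEMBARRIER_CMD_SHARED from
<linux/membarrier.h> and are defined locally only so the sketch builds
against older headers.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Assumed to match MEMBARRIER_CMD_QUERY / MEMBARRIER_CMD_SHARED. */
enum { SKETCH_CMD_QUERY = 0, SKETCH_CMD_SHARED = 1 };
static_assert(SKETCH_CMD_SHARED == (1 << 0),
              "SHARED is expected to be bit 0 of the query mask");

/* Set once by sketch_init(), before any other thread is started. */
static bool have_membarrier;

void sketch_init(void)
{
#ifdef SYS_membarrier
    /* QUERY returns a bitmask of supported commands, or -1 on error. */
    long mask = syscall(SYS_membarrier, SKETCH_CMD_QUERY, 0);
    have_membarrier = mask > 0 && (mask & SKETCH_CMD_SHARED);
#else
    have_membarrier = false;
#endif
}

/* Hot path: what used to be smp_rmb() becomes a compiler barrier. */
static inline void fast_side_barrier(void)
{
    if (have_membarrier)
        atomic_signal_fence(memory_order_seq_cst);  /* no instruction emitted */
    else
        atomic_thread_fence(memory_order_seq_cst);  /* fallback: real fence */
}

/* Cold path: pay for the ordering on behalf of all the fast-side threads. */
int slow_side_barrier(void)
{
#ifdef SYS_membarrier
    if (have_membarrier)
        return (int)syscall(SYS_membarrier, SKETCH_CMD_SHARED, 0);
#endif
    atomic_thread_fence(memory_order_seq_cst);
    return 0;
}
```

The point of the split is that the fast side runs often and the slow
side rarely, so moving the cost of the barrier into the system call is
a win as long as the kernel side really implies a full barrier on every
CPU running a thread of the process, which is what the promotion above
is about.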