On Mon, Apr 19, 2021 at 09:53:06AM +0200, Paolo Bonzini wrote:
On 19/04/21 09:32, Peter Zijlstra wrote:
On Sat, Apr 17, 2021 at 04:51:58PM +0200, Paolo Bonzini wrote:
On 16/04/21 09:09, Peter Zijlstra wrote:
Well, the obvious example would be seqlocks. C11 can't do them
Sure it can. C11 requires annotating with (the equivalent of) READ_ONCE all
reads of seqlock-protected fields, but the memory model supports seqlocks
just fine.
How does that help?
IIRC there's two problems, one on each side of the lock. On the write side
we have:
	seq++;
	smp_wmb();
	X = r;
	Y = r;
	smp_wmb();
	seq++;
Which C11 simply cannot do right because it doesn't have wmb.
It has atomic_thread_fence(memory_order_release), and
atomic_thread_fence(memory_order_acquire) on the read side.
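For reference, a minimal sketch of what the read side with an acquire fence could look like. Nothing below is from the thread itself: the declarations of seq, X and Y, the function name try_read() and the use of relaxed atomic loads for the data are all assumptions made for illustration.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical declarations; the thread does not show them. */
static atomic_uint seq;
static atomic_int X, Y;

/*
 * One read attempt. Returns false if a write was in progress (odd seq)
 * or seq changed while the data was being read; the caller retries.
 */
bool try_read(int *x, int *y)
{
	unsigned int s0 = atomic_load_explicit(&seq, memory_order_acquire);

	*x = atomic_load_explicit(&X, memory_order_relaxed);
	*y = atomic_load_explicit(&Y, memory_order_relaxed);

	/* Acquire fence: orders the data loads before the seq re-read. */
	atomic_thread_fence(memory_order_acquire);

	unsigned int s1 = atomic_load_explicit(&seq, memory_order_relaxed);

	return s0 == s1 && (s0 & 1) == 0;
}
```

A caller would loop on try_read() until it returns true.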
https://godbolt.org/z/85xoPxeE5
void writer(void)
{
	atomic_store_explicit(&seq, seq+1, memory_order_relaxed);
	atomic_thread_fence(memory_order_acquire);

	X = 1;
	Y = 2;

	atomic_store_explicit(&seq, seq+1, memory_order_release);
}
gives:
writer:
	adrp    x1, .LANCHOR0
	add     x0, x1, :lo12:.LANCHOR0
	ldr     w2, [x1, #:lo12:.LANCHOR0]
	add     w2, w2, 1
	str     w2, [x0]
	dmb     ishld
	ldr     w1, [x1, #:lo12:.LANCHOR0]
	mov     w3, 1
	mov     w2, 2
	stp     w3, w2, [x0, 4]
	add     w1, w1, w3
	stlr    w1, [x0]
	ret
Which, afaict, is completely buggered. What it seems to be doing is
turning the seq load into a load-acquire, but what we really need is to
make sure the seq store (increment) is ordered before the other stores.
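For comparison, a sketch of the same writer with the release fence suggested earlier in the thread in place of the acquire fence used above. The declarations, the name writer_release(), the single-writer assumption and the relaxed atomic stores to X and Y are assumptions for illustration, not taken from the test case; what this compiles to on arm64 has not been checked here.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical declarations; the test case above does not show them. */
static atomic_uint seq;
static atomic_int X, Y;

/* Assumes a single writer: nothing else ever modifies seq. */
void writer_release(int x, int y)
{
	unsigned int s = atomic_load_explicit(&seq, memory_order_relaxed);

	assert((s & 1) == 0);	/* even: no write in progress */

	atomic_store_explicit(&seq, s + 1, memory_order_relaxed);
	/* Release fence: orders the seq store above before the stores below. */
	atomic_thread_fence(memory_order_release);

	atomic_store_explicit(&X, x, memory_order_relaxed);
	atomic_store_explicit(&Y, y, memory_order_relaxed);

	/* Release store: orders the data stores before the final increment. */
	atomic_store_explicit(&seq, s + 2, memory_order_release);
}
```

In C11 terms, a release fence orders earlier loads and stores before later stores, which is the ordering asked for between the first seq increment and the stores to X and Y.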