Re: [PATCH 00/13] [RFC] Rust support
From: comex
Date: Sat Apr 17 2021 - 01:24:29 EST
On Fri, Apr 16, 2021 at 4:24 AM Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> Similar thing for RCU; C11 can't optimally do that; it needs to make
> rcu_dereference() a load-acquire [something ARM64 has already done in C
> because the compiler might be too clever by half when doing LTO :-(].
> But it's the compiler needing the acquire semantics, not the computer,
> which is just bloody wrong.
You may already know, but perhaps worth clarifying:
C11 does have atomic_signal_fence() which is a compiler fence. But a
compiler fence only ensures the loads will be emitted in the right
order, not that the CPU will execute them in the right order. CPU
architectures tend to guarantee that two loads will be executed in the
right order if the second one's address depends on the first one's
result, but a dependent load can stop being dependent after compiler
optimizations involving value speculation. Using a load-acquire works
around this, not because it stops the compiler from performing any
optimization, but because it tells the computer to execute the loads
in the right order *even if* the compiler has broken the value
dependence.
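
A minimal sketch of this distinction, written in Rust. `compiler_fence`
is Rust's counterpart of atomic_signal_fence(); the `Config` type and
the function names are hypothetical, chosen only for illustration. The
first reader constrains the compiler alone and relies on the address
dependency surviving optimization; the second asks for the ordering
explicitly with a load-acquire.

```rust
use std::sync::atomic::{compiler_fence, AtomicPtr, Ordering};

/// Hypothetical payload published through a shared pointer.
pub struct Config {
    pub value: u32,
}

/// Publisher: initialize the object, then make the pointer visible
/// with a store-release. The allocation is intentionally leaked here;
/// reclamation (the hard part of RCU) is out of scope for this sketch.
pub fn publish(slot: &AtomicPtr<Config>, value: u32) {
    let p = Box::into_raw(Box::new(Config { value }));
    slot.store(p, Ordering::Release);
}

/// Reader using a relaxed load plus a compiler-only fence.
/// The fence stops the compiler from reordering the two loads, but it
/// emits no CPU barrier, so ordering at run time rests on the address
/// dependency between the loads still being present in the generated
/// code. Not sufficient under the language memory model when a writer
/// runs concurrently; shown only for contrast.
pub fn read_with_compiler_fence(slot: &AtomicPtr<Config>) -> Option<u32> {
    let p = slot.load(Ordering::Relaxed);
    compiler_fence(Ordering::Acquire);
    if p.is_null() {
        return None;
    }
    Some(unsafe { (*p).value })
}

/// Reader using a load-acquire. The CPU is told to order the
/// dereference after the pointer load, whether or not the compiler
/// kept the dependency.
pub fn read_with_acquire(slot: &AtomicPtr<Config>) -> Option<u32> {
    let p = slot.load(Ordering::Acquire);
    if p.is_null() {
        return None;
    }
    Some(unsafe { (*p).value })
}
```

On architectures where a plain load already has acquire semantics the
two readers may well compile to the same instructions; the difference
shows up on weakly ordered CPUs.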
So C11 atomics don't make the situation worse, compared to Linux's
atomics implementation based on volatile and inline assembly. Both
are unsound in the presence of value speculation. C11 atomics were
*supposed* to make the situation better, with memory_order_consume,
which would have specifically forbidden the compiler from performing
value speculation. But all the compilers punted on getting this to
work and instead just implemented memory_order_consume as
memory_order_acquire.
As for Rust, it compiles to the same LLVM IR that Clang compiles C
into. Volatile, inline assembly, and C11-based atomics: all of these
are available in Rust, and generate exactly the same code as their C
counterparts, for better or for worse. Unfortunately, the Rust
project has relatively limited muscle when it comes to contributing to
LLVM. So while it would definitely be nice if Rust could make RCU
sound, and from a specification perspective I think the Rust people
would be quite willing (and probably easier to work with than the C
committee), I suspect that implementing this would require the kind of
sweeping change to LLVM that is probably not going to come from Rust.
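
To illustrate the earlier point that volatile, inline assembly, and
C11-style atomics are all available in Rust, here is a small sketch.
The function names are hypothetical. The empty `asm!` block is assumed
to act as a compiler-only barrier, in the style of C's
`asm volatile("" ::: "memory")`, because without the `nomem` option the
compiler must assume the block may read or write memory.

```rust
use std::arch::asm;
use std::ptr;
use std::sync::atomic::{AtomicU32, Ordering};

/// Volatile store followed by a volatile load, the counterpart of
/// accessing a C `volatile` object through a pointer.
pub fn volatile_roundtrip(cell: &mut u32, new: u32) -> u32 {
    let p: *mut u32 = cell;
    unsafe {
        ptr::write_volatile(p, new);
        ptr::read_volatile(p)
    }
}

/// Empty inline assembly used as a compiler barrier. It emits no
/// instructions and therefore gives no ordering guarantee at the CPU
/// level.
pub fn compiler_barrier() {
    unsafe {
        asm!("", options(nostack, preserves_flags));
    }
}

/// Atomic read-modify-write with the same ordering vocabulary as C11.
/// Returns the value after the increment.
pub fn atomic_increment(counter: &AtomicU32) -> u32 {
    counter.fetch_add(1, Ordering::AcqRel) + 1
}
```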
There are other areas where I think that kind of discussion might be
more fruitful. For example, the Rust documentation currently says
that a volatile read racing with a non-volatile write (as in seqlocks)
is undefined behavior. [1] However, I am of the opinion that this is
essentially a spec bug, for reasons that are probably not worth
getting into here.
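
For readers unfamiliar with the pattern, a rough sketch of a
seqlock-style reader and writer in Rust follows. The `SeqLock` type and
its methods are hypothetical, and a single writer is assumed. The type
is deliberately not marked `Sync`: sharing it across threads would
produce exactly the volatile-read-versus-plain-write race that the
documentation in [1] classifies as undefined behavior, so as written
the code only shows the shape of the accesses.

```rust
use std::cell::UnsafeCell;
use std::ptr;
use std::sync::atomic::{fence, AtomicUsize, Ordering};

/// Hypothetical seqlock protecting a pair of integers.
pub struct SeqLock {
    seq: AtomicUsize,
    data: UnsafeCell<(u64, u64)>,
}

impl SeqLock {
    pub fn new(a: u64, b: u64) -> Self {
        SeqLock {
            seq: AtomicUsize::new(0),
            data: UnsafeCell::new((a, b)),
        }
    }

    /// Writer (single writer assumed): make the sequence odd, update
    /// the data with a plain non-volatile write, then make it even.
    pub fn write(&self, a: u64, b: u64) {
        let s = self.seq.load(Ordering::Relaxed);
        self.seq.store(s.wrapping_add(1), Ordering::Relaxed);
        fence(Ordering::Release);
        unsafe {
            *self.data.get() = (a, b);
        }
        self.seq.store(s.wrapping_add(2), Ordering::Release);
    }

    /// Reader: retry until the sequence is even and unchanged across
    /// the volatile read of the data.
    pub fn read(&self) -> (u64, u64) {
        loop {
            let s1 = self.seq.load(Ordering::Acquire);
            if s1 & 1 != 0 {
                std::hint::spin_loop();
                continue;
            }
            // This is the read that would race with the writer's plain
            // store if the lock were shared between threads.
            let v = unsafe { ptr::read_volatile(self.data.get()) };
            fence(Ordering::Acquire);
            let s2 = self.seq.load(Ordering::Relaxed);
            if s1 == s2 {
                return v;
            }
        }
    }
}
```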
[1] https://doc.rust-lang.org/nightly/std/ptr/fn.read_volatile.html