Re: liburcu: LTO breaking rcu_dereference on arm64 and possibly other architectures ?
From: Paul E. McKenney
Date: Fri Apr 16 2021 - 16:01:50 EST
On Fri, Apr 16, 2021 at 03:30:53PM -0400, Mathieu Desnoyers wrote:
> ----- On Apr 16, 2021, at 3:02 PM, paulmck paulmck@xxxxxxxxxx wrote:
> [...]
> >
> > If it can be done reasonably, I suggest also having some way for the
> > person building userspace RCU to say "I know what I am doing, so do
> > it with volatile rather than memory_order_consume."
>
> Like so ?
>
> #define CMM_ACCESS_ONCE(x) (*(__volatile__ __typeof__(x) *)&(x))
> #define CMM_LOAD_SHARED(p) CMM_ACCESS_ONCE(p)
>
> /*
>  * By defining URCU_DEREFERENCE_USE_VOLATILE, the user requires use of
>  * volatile access to implement rcu_dereference rather than
>  * memory_order_consume load from the C11/C++11 standards.
>  *
>  * This may improve performance on weakly-ordered architectures where
>  * the compiler implements memory_order_consume as a
>  * memory_order_acquire, which is stricter than required by the
>  * standard.
>  *
>  * Note that using volatile accesses for rcu_dereference may cause
>  * LTO to generate incorrectly ordered code starting from C11/C++11.
>  */
>
> #ifdef URCU_DEREFERENCE_USE_VOLATILE
> # define rcu_dereference(x)	CMM_LOAD_SHARED(x)
> #else
> # if defined (__cplusplus)
> #  if __cplusplus >= 201103L
> #   include <atomic>
> #   define rcu_dereference(x)	((std::atomic<__typeof__(x)>)(x)).load(std::memory_order_consume)
> #  else
> #   define rcu_dereference(x)	CMM_LOAD_SHARED(x)
> #  endif
> # else
> #  if (defined(__STDC_VERSION__) && __STDC_VERSION__ >= 201112L)
> #   include <stdatomic.h>
> #   define rcu_dereference(x)	atomic_load_explicit(&(x), memory_order_consume)
> #  else
> #   define rcu_dereference(x)	CMM_LOAD_SHARED(x)
> #  endif
> # endif
> #endif
Looks good to me!
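
For concreteness, here is a minimal sketch of how a reader and an updater
might sit on top of such a switch. It is not taken from liburcu: every
sketch_* / SKETCH_* name is hypothetical, and the non-volatile branch
assumes the GCC/Clang __atomic_load_n() builtin instead of
atomic_load_explicit(), because the C11 generic function formally expects
an _Atomic object while RCU-protected pointers are usually plain pointers.
Read-side critical sections and grace periods are deliberately left out.

```c
#include <assert.h>
#include <stddef.h>

/* All sketch_* and SKETCH_* names are hypothetical, not liburcu API. */

/* Volatile access, same shape as CMM_ACCESS_ONCE in the quoted proposal. */
#define SKETCH_ACCESS_ONCE(x)	(*(__volatile__ __typeof__(x) *)&(x))

#ifdef SKETCH_DEREFERENCE_USE_VOLATILE
/* "I know what I am doing": plain volatile load of the pointer. */
# define sketch_dereference(p)	SKETCH_ACCESS_ONCE(p)
#else
/* Assumption: GCC/Clang builtin, usable on a non-_Atomic pointer. */
# define sketch_dereference(p)	__atomic_load_n(&(p), __ATOMIC_CONSUME)
#endif

struct sketch_config {
	int value;
};

static struct sketch_config sketch_initial = { 42 };

/* The shared pointer that updaters publish and readers dereference. */
static struct sketch_config *sketch_shared = &sketch_initial;

/*
 * Updater side: the release store orders the initialization of *next
 * before the pointer becomes visible to readers.
 */
static void sketch_publish(struct sketch_config *next)
{
	__atomic_store_n(&sketch_shared, next, __ATOMIC_RELEASE);
}

/*
 * Reader side: load the pointer exactly once, then access fields only
 * through the loaded copy so the accesses depend on that load.
 */
static int sketch_read_value(void)
{
	struct sketch_config *cfg = sketch_dereference(sketch_shared);

	assert(cfg != NULL);
	return cfg->value;
}
```

Building the same file with -DSKETCH_DEREFERENCE_USE_VOLATILE selects the
volatile path, which mirrors the opt-out discussed above. A real updater
would also have to wait for a grace period before reclaiming the old
structure.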
Thanx, Paul