Re: [PATCH v4 00/11] Rework READ_ONCE() to improve codegen
From: Marco Elver
Date:  Fri Apr 24 2020 - 11:54:32 EST
On Fri, 24 Apr 2020 at 15:42, Will Deacon <will@xxxxxxxxxx> wrote:
>
> Hi Peter,
>
> [+KCSAN folks]
>
> On Wed, Apr 22, 2020 at 01:26:27PM +0100, Will Deacon wrote:
> > On Wed, Apr 22, 2020 at 01:37:21PM +0200, Peter Zijlstra wrote:
> > > On Wed, Apr 22, 2020 at 09:18:39AM +0100, Will Deacon wrote:
> > > > On Tue, Apr 21, 2020 at 11:42:56AM -0700, Linus Torvalds wrote:
> > > > > On Tue, Apr 21, 2020 at 8:15 AM Will Deacon <will@xxxxxxxxxx> wrote:
> > > > > >
> > > > > > It's me again. This is version four of the READ_ONCE() codegen improvement
> > > > > > patches [...]
> > > > >
> > > > > Let's just plan on biting the bullet and do this for 5.8. I'm assuming
> > > > > > that I'll just get a pull request from you?
> > > >
> > > > Sure thing, thanks. I'll get it into -next along with the arm64 bits for
> > > > 5.8, but I'll send it as a separate pull when the time comes. I'll also
> > > > include the sparc32 changes because otherwise the build falls apart and
> > > > we'll get an army of angry robots yelling at us (they seem to form the
> > > > majority of the active sparc32 user base afaict).
> > >
> > > So I'm obviously all for these patches; do note however that it collides
> > > most mighty with the KCSAN stuff, which I believe is still pending.
> >
> > That stuff has been pending for the last two releases afaict :/
> >
> > Anyway, I'm happy to either provide a branch with this series on, or do
> > the merge myself, or send this again based on something else. What works
> > best for you? The only thing I'd obviously like to avoid is tightly
> > coupling this to KCSAN if there's a chance of it missing the merge window
> > again.
>
> FWIW, I had a go at rebasing onto linux-next, just to get an idea for how
> bad it is. It's fairly bad, and I don't think it's fair to inflict it on
> sfr. I've included the interesting part of the resulting compiler.h below
> for you and the KCSAN crowd to take a look at (yes, there's room for
> subsequent cleanup, but I was focussing on the conflict resolution for now).

Thanks for the heads up. From what I can tell, your proposed change
may work fine for KCSAN. However, I've had trouble compiling this:

1. kcsan_disable_current() / kcsan_enable_current() do not work as-is,
   because READ_ONCE/WRITE_ONCE seem to be used from compilation units
   where the KCSAN runtime is not available (e.g.
   arch/x86/entry/vdso/Makefile, which had to set KCSAN_SANITIZE := n for
   that reason).

2. Some new uaccess whitelist entries were needed.

I think this is what's needed:
https://lkml.kernel.org/r/20200424154730.190041-1-elver@xxxxxxxxxx

With that applied, you can switch the calls in READ_ONCE() and
WRITE_ONCE() to __kcsan_disable_current() / __kcsan_enable_current().
After that, I was able to compile, and my test suite passed.
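
For illustration, here is a rough userspace sketch of that substitution. It
is not the kernel code: the KCSAN hooks are replaced by counting stubs so
that it builds outside the kernel, plain typeof stands in for
__unqual_scalar_typeof(), and smp_read_barrier_depends() is left out. The
*_SKETCH macro names, the sketch_* counters and rwonce_sketch_selfcheck()
are hypothetical, and the stub bodies make no claim about what the linked
patch actually implements.

```c
/* Userspace sketch only; needs GNU C (statement expressions, typeof). */
#include <assert.h>
#include <stddef.h>

/* Hypothetical bookkeeping, only there so the sketch can be checked. */
static int sketch_disable_depth;
static int sketch_checks;

/* Counting stand-ins for the KCSAN hooks named in this thread. */
static inline void kcsan_check_atomic_read(const volatile void *p, size_t sz)
{
        (void)p; (void)sz;
        sketch_checks++;
}

static inline void kcsan_check_atomic_write(const volatile void *p, size_t sz)
{
        (void)p; (void)sz;
        sketch_checks++;
}

static inline void __kcsan_disable_current(void) { sketch_disable_depth++; }
static inline void __kcsan_enable_current(void)  { sketch_disable_depth--; }

/* Same shape as the quoted __READ_ONCE_SCALAR(), with the __ variants. */
#define READ_ONCE_SKETCH(x)                                             \
({                                                                      \
        __typeof__(x) *__xp = &(x);                                     \
        kcsan_check_atomic_read(__xp, sizeof(*__xp));                   \
        __kcsan_disable_current();                                      \
        ({                                                              \
                __typeof__(x) __x =                                     \
                        *(const volatile __typeof__(x) *)__xp;          \
                __kcsan_enable_current();                               \
                __x;                                                    \
        });                                                             \
})

/* Same shape as the quoted __WRITE_ONCE_SCALAR(), with the __ variants. */
#define WRITE_ONCE_SKETCH(x, val)                                       \
do {                                                                    \
        __typeof__(x) *__xp = &(x);                                     \
        kcsan_check_atomic_write(__xp, sizeof(*__xp));                  \
        __kcsan_disable_current();                                      \
        *(volatile __typeof__(x) *)__xp = (val);                        \
        __kcsan_enable_current();                                       \
} while (0)

/* One read and one write; disable/enable must stay balanced. */
void rwonce_sketch_selfcheck(void)
{
        int v = 42;
        int seen = READ_ONCE_SKETCH(v);

        WRITE_ONCE_SKETCH(v, 7);
        assert(seen == 42);
        assert(v == 7);
        assert(sketch_disable_depth == 0);
}
```

The only point of the sketch is that each access is bracketed by a balanced
disable/enable pair after the explicit atomic check.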

Thanks,
-- Marco

> So, I think the best bet is either for my changes to go into -tip on top
> of the KCSAN stuff, or for the KCSAN stuff to be dropped from -next (it's
> been there since at least January). Do you know if they are definitely
> supposed to be going in for 5.8?
>
> Any other ideas?
>
> Cheers,
>
> Will
>
> --->8
>
> /*
>  * Prevent the compiler from merging or refetching reads or writes. The
>  * compiler is also forbidden from reordering successive instances of
>  * READ_ONCE and WRITE_ONCE, but only when the compiler is aware of some
>  * particular ordering. One way to make the compiler aware of ordering is to
>  * put the two invocations of READ_ONCE or WRITE_ONCE in different C
>  * statements.
>  *
>  * These two macros will also work on aggregate data types like structs or
>  * unions.
>  *
>  * Their two major use cases are: (1) Mediating communication between
>  * process-level code and irq/NMI handlers, all running on the same CPU,
>  * and (2) Ensuring that the compiler does not fold, spindle, or otherwise
>  * mutilate accesses that either do not require ordering or that interact
>  * with an explicit memory barrier or atomic instruction that provides the
>  * required ordering.
>  */
> #include <asm/barrier.h>
> #include <linux/kasan-checks.h>
> #include <linux/kcsan-checks.h>
>
> /*
>  * Use __READ_ONCE() instead of READ_ONCE() if you do not require any
>  * atomicity or dependency ordering guarantees. Note that this may result
>  * in tears!
>  */
> #define __READ_ONCE(x)  (*(const volatile __unqual_scalar_typeof(x) *)&(x))
>
> #define __READ_ONCE_SCALAR(x)                                           \
> ({                                                                      \
>         typeof(x) *__xp = &(x);                                         \
>         kcsan_check_atomic_read(__xp, sizeof(*__xp));                   \
>         kcsan_disable_current();                                        \
>         ({                                                              \
>                 __unqual_scalar_typeof(x) __x = __READ_ONCE(*__xp);     \
>                 kcsan_enable_current();                                 \
>                 smp_read_barrier_depends();                             \
>                 (typeof(x))__x;                                         \
>         });                                                             \
> })
>
> #define READ_ONCE(x)                                                    \
> ({                                                                      \
>         compiletime_assert_rwonce_type(x);                              \
>         __READ_ONCE_SCALAR(x);                                          \
> })
>
> #define __WRITE_ONCE(x, val)                                            \
> do {                                                                    \
>         *(volatile typeof(x) *)&(x) = (val);                            \
> } while (0)
>
> #define __WRITE_ONCE_SCALAR(x, val)                                     \
> do {                                                                    \
>         typeof(x) *__xp = &(x);                                         \
>         kcsan_check_atomic_write(__xp, sizeof(*__xp));                  \
>         kcsan_disable_current();                                        \
>         __WRITE_ONCE(*__xp, val);                                       \
>         kcsan_enable_current();                                         \
> } while (0)
>
> #define WRITE_ONCE(x, val)                                              \
> do {                                                                    \
>         compiletime_assert_rwonce_type(x);                              \
>         __WRITE_ONCE_SCALAR(x, val);                                    \
> } while (0)
>
> #ifdef CONFIG_KASAN
> /*
>  * We can't declare function 'inline' because __no_sanitize_address conflicts
>  * with inlining. Attempt to inline it may cause a build failure.
>  *     https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67368
>  * '__maybe_unused' allows us to avoid defined-but-not-used warnings.
>  */
> # define __no_kasan_or_inline __no_sanitize_address notrace __maybe_unused
> # define __no_sanitize_or_inline __no_kasan_or_inline
> #else
> # define __no_kasan_or_inline __always_inline
> #endif
>
> #define __no_kcsan __no_sanitize_thread
> #ifdef __SANITIZE_THREAD__
> /*
>  * Rely on __SANITIZE_THREAD__ instead of CONFIG_KCSAN, to avoid not inlining in
>  * compilation units where instrumentation is disabled. The attribute 'noinline'
>  * is required for older compilers, where implicit inlining of very small
>  * functions renders __no_sanitize_thread ineffective.
>  */
> # define __no_kcsan_or_inline __no_kcsan noinline notrace __maybe_unused
> # define __no_sanitize_or_inline __no_kcsan_or_inline
> #else
> # define __no_kcsan_or_inline __always_inline
> #endif
>
> #ifndef __no_sanitize_or_inline
> #define __no_sanitize_or_inline __always_inline
> #endif
>
> static __no_sanitize_or_inline
> unsigned long __read_once_word_nocheck(const void *addr)
> {
>         return __READ_ONCE(*(unsigned long *)addr);
> }
>
> /*
>  * Use READ_ONCE_NOCHECK() instead of READ_ONCE() if you need to load a
>  * word from memory atomically but without telling KASAN/KCSAN. This is
>  * usually used by unwinding code when walking the stack of a running process.
>  */
> #define READ_ONCE_NOCHECK(x)                                            \
> ({                                                                      \
>         unsigned long __x = __read_once_word_nocheck(&(x));             \
>         smp_read_barrier_depends();                                     \
>         __x;                                                            \
> })

Unconditionally loading an unsigned long doesn't seem right, and might
also result in OOB reads when x is narrower than a word.
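
A small userspace sketch of the concern, assuming GNU C for typeof. The
helper mirrors the quoted __read_once_word_nocheck(); the macro names, the
union and the function word_load_picks_up_neighbours() are hypothetical. The
byte being read sits inside a word-sized union, so the wide load stays in
bounds here and only demonstrates that neighbouring bytes are pulled in. The
sized variant is there purely for contrast and is not a proposed fix.

```c
/* Userspace sketch only; needs GNU C (typeof). */
#include <assert.h>
#include <string.h>

/* Mirrors the quoted helper: always loads a full word from addr. */
static unsigned long read_once_word_nocheck(const void *addr)
{
        return *(const volatile unsigned long *)addr;
}

/* Word-sized load regardless of the type of x, as in the quoted macro. */
#define READ_ONCE_NOCHECK_WORD(x)   read_once_word_nocheck(&(x))

/* For contrast: a load that uses the size of x itself. */
#define READ_ONCE_NOCHECK_SIZED(x)  (*(const volatile __typeof__(x) *)&(x))

/* Keeps the demo in bounds: the byte lives inside a full word. */
union word_or_bytes {
        unsigned long word;
        unsigned char bytes[sizeof(unsigned long)];
};

/* Returns 1 if the word load saw more than the one byte asked for. */
int word_load_picks_up_neighbours(void)
{
        union word_or_bytes u;

        memset(&u, 0xff, sizeof(u));
        u.bytes[0] = 0x01;

        unsigned long wide = READ_ONCE_NOCHECK_WORD(u.bytes[0]);
        unsigned char sized = READ_ONCE_NOCHECK_SIZED(u.bytes[0]);

        assert(sized == 0x01);
        return wide != 0x01;
}
```

If the byte were the last element of an array rather than part of a union,
the same word load would run past the end of the object.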
> static __no_kasan_or_inline
> unsigned long read_word_at_a_time(const void *addr)
> {
>         kasan_check_read(addr, 1);
>         return *(unsigned long *)addr;
> }
>
> /**
>  * data_race - mark an expression as containing intentional data races
>  *
>  * This data_race() macro is useful for situations in which data races
>  * should be forgiven.  One example is diagnostic code that accesses
>  * shared variables but is not a part of the core synchronization design.
>  *
>  * This macro *does not* affect normal code generation, but is a hint
>  * to tooling that data races here are to be ignored.
>  */
> #define data_race(expr)                                                        \
>         ({                                                                     \
>                 typeof(({ expr; })) __val;                                     \
>                 kcsan_disable_current();                                       \
>                 __val = ({ expr; });                                           \
>                 kcsan_enable_current();                                        \
>                 __val;                                                         \
>         })