Re: [PATCH v5 00/18] Rework READ_ONCE() to improve codegen

From: Marco Elver
Date: Wed May 13 2020 - 14:54:20 EST


On Wed, 13 May 2020 at 19:47, Will Deacon <will@xxxxxxxxxx> wrote:
>
> On Wed, May 13, 2020 at 07:32:58PM +0200, Marco Elver wrote:
> > On Wed, 13 May 2020 at 18:50, Will Deacon <will@xxxxxxxxxx> wrote:
> > >
> > > On Wed, May 13, 2020 at 03:15:55PM +0200, Marco Elver wrote:
> > > > On Wed, 13 May 2020 at 14:40, Will Deacon <will@xxxxxxxxxx> wrote:
> > > > >
> > > > > On Wed, May 13, 2020 at 02:32:43PM +0200, Peter Zijlstra wrote:
> > > > > > On Wed, May 13, 2020 at 01:48:41PM +0200, Marco Elver wrote:
> > > > > >
> > > > > > > Disabling most instrumentation for arch/x86 is reasonable. Also fine
> > > > > > > with the __READ_ONCE/__WRITE_ONCE changes (your improved
> > > > > > > compiler-friendlier version).
> > > > > > >
> > > > > > > We likely can't have both: still instrument __READ_ONCE/__WRITE_ONCE
> > > > > > > (as Will suggested) *and* avoid double-instrumentation in arch_atomic.
> > > > > > > If most use-cases of __READ_ONCE/__WRITE_ONCE are likely to use
> > > > > > > data_race() or KCSAN_SANITIZE := n anyway, I'd say it's reasonable for
> > > > > > > now.
> > > > >
> > > > > I agree that Peter's patch is the right thing to do for now. I was hoping we
> > > > > could instrument __{READ,WRITE}_ONCE(), but that we before I realised that
> > > > > __no_sanitize_or_inline doesn't seem to do anything.
> > > > >
> > > > > > Right, if/when people want sanitize crud enabled for x86 I need
> > > > > > something that:
> > > > > >
> > > > > > - can mark a function 'no_sanitize' and all code that gets inlined into
> > > > > > that function must automagically also not get sanitized. ie. make
> > > > > > inline work like macros (again).
> > > > > >
> > > > > > And optionally:
> > > > > >
> > > > > > - can mark a function explicitly 'sanitize', and only when an explicit
> > > > > > sanitize and no_sanitize mix in inlining give the current
> > > > > > incompatible attribute splat.
> > > > > >
> > > > > > That way we can have the noinstr function attribute imply no_sanitize
> > > > > > and frob the DEFINE_IDTENTRY*() macros to use (a new) sanitize_or_inline
> > > > > > helper instead of __always_inline for __##func().
> > > > >
> > > > > Sounds like a good plan to me, assuming the compiler folks are onboard.
> > > > > In the meantime, can we kill __no_sanitize_or_inline and put it back to
> > > > > the old __no_kasan_or_inline, which I think simplifies compiler.h and
> > > > > doesn't mislead people into using the function annotation to avoid KCSAN?
> > > > >
> > > > > READ_ONCE_NOCHECK should also probably be READ_ONCE_NOKASAN, but I
> > > > > appreciate that's a noisier change.
> > > >
> > > > So far so good, except: both __no_sanitize_or_inline and
> > > > __no_kcsan_or_inline *do* avoid KCSAN instrumenting plain accesses, it
> > > > just doesn't avoid explicit kcsan_check calls, like those in
> > > > READ/WRITE_ONCE if KCSAN is enabled for the compilation unit. That's
> > > > just because macros won't be redefined just for __no_sanitize
> > > > functions. Similarly, READ_ONCE_NOCHECK does work as expected, and its
> > > > access is unchecked.
> > > >
> > > > This will have the expected result:
> > > > __no_sanitize_or_inline void foo(void) { x++; } // no data races reported
> > > >
> > > > This will not work as expected:
> > > > __no_sanitize_or_inline void foo(void) { READ_ONCE(x); } // data
> > > > races are reported
> > >
> > > But the problem is that *this* does not work as expected:
> > >
> > > unsigned long __no_sanitize_or_inline foo(unsigned long *ptr)
> > > {
> > > return READ_ONCE_NOCHECK(*ptr);
> > > }
> > >
> > > which I think means that the function annotation is practically useless.
> >
> > Let me understand the problem better:
> >
> > - We do not want __tsan_func_entry/exit (looking at the disassembly,
> > these aren't always generated).
> > - We do not want kcsan_disable/enable calls (with the new __READ_ONCE version).
> > - We do *not* want the call to __read_once_word_nocheck if we have
> > __no_sanitize_or_inline. AFAIK that's the main problem -- this applies
> > to both KASAN and KCSAN.
>
> Sorry, I should've been more explicit. The code above, with KASAN enabled,
> compiles to:
>
> ffffffff810a2d50 <foo>:
> ffffffff810a2d50: 48 8b 07 mov (%rdi),%rax
> ffffffff810a2d53: c3 retq
>
> but with KCSAN enabled, compiles to:
>
> ffffffff8109ecd0 <foo>:
> ffffffff8109ecd0: 53 push %rbx
> ffffffff8109ecd1: 48 89 fb mov %rdi,%rbx
> ffffffff8109ecd4: 48 8b 7c 24 08 mov 0x8(%rsp),%rdi
> ffffffff8109ecd9: e8 52 9c 1a 00 callq ffffffff81248930 <__tsan_func_entry>
> ffffffff8109ecde: 48 89 df mov %rbx,%rdi
> ffffffff8109ece1: e8 1a 00 00 00 callq ffffffff8109ed00 <__read_once_word_nocheck>
> ffffffff8109ece6: 48 89 c3 mov %rax,%rbx
> ffffffff8109ece9: e8 52 9c 1a 00 callq ffffffff81248940 <__tsan_func_exit>
> ffffffff8109ecee: 48 89 d8 mov %rbx,%rax
> ffffffff8109ecf1: 5b pop %rbx
> ffffffff8109ecf2: c3 retq
>
> Is that expected? There don't appear to be any more annotations to throw
> at it.

Right, so this is expected. We can definitely make
__tsan_func_entry/exit disappear with Clang, with GCC it's going to be
a while if we want to fix it.

If we remove 'noinline' from __no_kcsan_or_inline, we no longer get
the call to __read_once_word_nocheck above! But...

For KCSAN we force 'noinline' because older compilers still inline and
then instrument small functions even if we just have the no_sanitize
attribute (without inline mentioned). The same is actually true for
KASAN, so KASAN's READ_ONCE_NOCHECK might be broken in a few places,
but nobody seems to have noticed [1]. KASAN's __no_kasan_or_inline
should also have a 'noinline' I think. I just tested
__no_kcsan_or_inline without 'noinline', and yes, GCC 9 still decided
to inline a small function and then instrument the accesses.

[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59600

The good news is that Clang does the right thing when removing
'noinline' from __no_kcsan_or_inline:
1. doesn't inline into functions that are instrumented, and
2. your above example doesn't do the call to __read_once_word_nocheck.

The obvious solution to this is: restrict which compiler we want to support?

> > From what I gather, we want to just compile the function as if the
> > sanitizer was never enabled. One reason for why this doesn't quite
> > work is because of the preprocessor.
> >
> > Note that the sanitizers won't complain about these accesses, which
> > unfortunately is all these attributes ever were documented to do. So
> > the attributes aren't completely useless. Why doesn't
> > K[AC]SAN_SANITIZE := n work?
>
> I just don't get the point in having a function annotation if you then have to
> pass flags at the per-object level. That also then necessitates either weird
> refactoring and grouping of code into "noinstrument.c" type files, or blanket
> disabling of instrumentation for things like arch/x86/

If you want a solution now, here is one way to get us closer to where
we want to be:

1. Peter's patch to add data_race around __READ_ONCE/__WRITE_ONCE.
2. Patch to make __tsan_func_entry/exit disappear with Clang.
3. Remove 'noinline' from __no_kcsan_or_inline.
4. Patch to warn users that KCSAN may have problems with GCC and
should use Clang >= 7.

But this is probably only half a solution.

If you *also* want to fix __READ_ONCE etc not adding calls to
__no_sanitize functions:

5. Remove any mention of data_race and kcsan_check calls from
__{READ,WRITE}_ONCE, {READ,WRITE}_ONCE. [Won't need #1 above.]
6. I'll send a patch to make KCSAN distinguish volatile accesses, and
we will require Clang 11.

That is *if* you insist on __no_sanitize to behave like you suggest.
Note that, at that point, I really don't know how to salvage GCC,
mainly because of fixing __no_sanitize with GCC looking hopeless.

Let me know what you prefer.

Thanks,
-- Marco