Re: [PATCH] Partially revert "compiler: enable CONFIG_OPTIMIZE_INLINING forcibly"

From: Russell King - ARM Linux admin
Date: Tue Oct 01 2019 - 10:36:39 EST

Next message: Stephen Boyd: "Re: [PATCH v3 3/3] clk: qcom: Add Global Clock controller (GCC) driver for SC7180"
Previous message: David Laight: "RE: [PATCH] staging: rtl8723bs: hal: Fix memcpy calls"
In reply to: Andrew Murray: "Re: [PATCH] Partially revert "compiler: enable CONFIG_OPTIMIZE_INLINING forcibly""
Next in thread: Andrew Murray: "Re: [PATCH] Partially revert "compiler: enable CONFIG_OPTIMIZE_INLINING forcibly""
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On Tue, Oct 01, 2019 at 12:41:30PM +0100, Andrew Murray wrote:
> On Tue, Oct 01, 2019 at 11:42:54AM +0100, Will Deacon wrote:
> > On Tue, Oct 01, 2019 at 06:40:26PM +0900, Masahiro Yamada wrote:
> > > On Mon, Sep 30, 2019 at 8:45 PM Will Deacon <will@xxxxxxxxxx> wrote:
> > > > diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
> > > > index 93d97f9b0157..c37c72adaeff 100644
> > > > --- a/lib/Kconfig.debug
> > > > +++ b/lib/Kconfig.debug
> > > > @@ -312,6 +312,7 @@ config HEADERS_CHECK
> > > >
> > > > config OPTIMIZE_INLINING
> > > > def_bool y
> > > > + depends on !(ARM || ARM64) # https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91111
> > >
> > >
> > > This is a too big hammer.
> >
> > It matches the previous default behaviour!
> >
> > > For ARM, it is not a compiler bug, so I am trying to fix the kernel code.
> > >
> > > For ARM64, even if it is a compiler bug, you can add __always_inline
> > > to the functions in question.
> > > (arch_atomic64_dec_if_positive in this case).
> > >
> > > You do not need to force __always_inline globally.
> >
> > So you'd prefer I do something like the diff below? I mean, it's a start,
> > but I do worry that we're hanging arch/arm/ out to dry.
>
> If I've understood one part of this issue correctly - and using the
> c2p_unsupported build failure as an example [1], there are instances in
> the kernel where it is assumed that the compiler will optimise out a call
> to an undefined function, and also assumed that the compiler will know
> at compile time that the function will never get called. It's common to
> satisfy this assumption when the calling function is inlined.
>
> But I suspect there may be other cases similar to c2p_unsupported which
> are still lurking.
>
> For example the following functions are called but non-existent, and thus
> may be an area worth investigating:
>
> __buggy_use_of_MTHCA_PUT, __put_dbe_unknown, __cmpxchg_wrong_size,
> __bad_percpu_size, __put_user_bad, __get_user_unknown,
> __bad_unaligned_access_size, __bad_xchg
>
> But more generally, as this is a common pattern - isn't there a benefit
> here for changing all of these to BUILD_BUG? (So they can be found easily).

Precisely, what is your suggestion?

If you think that replacing the call to __get_user_bad with BUILD_BUG(),
BUILD_BUG() becomes a no-op when __OPTIMIZE__ is not defined (see the
definition of __compiletime_assert() in linux/compiler.h); this means
such places will be reachable, which leads to uninitialised variables.

> Or to avoid this class of issues, change them to BUG or unreachable - but
> lose the benefit of compile time detection?

I think you ought to read the GCC manual wrt __builtin_unreachable().
"If control flow reaches the point of the `__builtin_unreachable',
the program is undefined. It is useful in situations where the
compiler cannot deduce the unreachability of the code."

I have seen cases where the instructions following an unreachable
code section have been the literal pool for the function - which,
if reached, would be quite confusing to debug. If you're lucky, you
might get an undefined instruction exception. If not, you could
continue and start executing another part of the function, leading
to possibly no crash at all - but unexpected results (which may end
up leaking sensitive data.)

For example, in our BUG() implementation on 32-bit ARM, we use
unreachable() after the asm() statement creating the bug table
entry and inserting the undefined instruction into the text.
Here's the resulting disassembly:

278: ebfffffe bl 0 <page_mapped>
278: R_ARM_CALL page_mapped
27c: e3500000 cmp r0, #0
280: 1a00006c bne 438 <invalidate_inode_pages2_range+0x3ac>
...
2d4: ebfffffe bl 0 <_raw_spin_lock_irqsave>
2d4: R_ARM_CALL _raw_spin_lock_irqsave
2d8: e5943008 ldr r3, [r4, #8]
2dc: e3130001 tst r3, #1
2e0: e1a02000 mov r2, r0
2e4: 1a000054 bne 43c <invalidate_inode_pages2_range+0x3b0>
...
438: e7f001f2 .word 0xe7f001f2
43c: e2433001 sub r3, r3, #1
440: eaffffa9 b 2ec <invalidate_inode_pages2_range+0x260>

Now, consider what unreachable() actually gets you here - it tells
the compiler that we do not expect to reach this point (that being
the point between 438 and 43c.) If we were to reach that point, we
would continue executing the code at 43c.

In this case, it would be like...

if (BUG_ON(page_mapped(page)))
goto random-location-in-xa_lock_irqsave()-inside-invalidate_complete_page2();

So no. unreachable() is not an option.

We really do want these places to be compile-time detected - relying
on triggering them at runtime is just not good enough IMHO (think
about how much testing the kernel would require to discover one of
these suckers buried in the depths of the code.)

Here's the question to ask: do we want to reliably detect issues
that we know are bad, which can lead to:
- unreliable kernel operation,
- user exploitable crashes,
or do we want to hide them for the sake of allowing -O0 compilation?

Given that the kernel as a general rule has very poor run-time test
coverage at the moment, I don't think this is the time to consider
giving up the protection that we have against this badness.

We've had several instances where these checks have triggered in the
user access code, and people have noticed when doing build tests.
They probably don't have the ability to do run-time testing on every
arch.

So, the existing facility of detecting these at build time is, IMHO,
an absolute must.

It would be different if the kernel community as a whole had the
ability to run-test every code path through the kernel source on
every arch, but I don't see that's a remotely realistic prospect.

If we want -O0 to work, but still want to preserve the ability to
detect these adequately, I think the easiest solution to that would
be to provide these dummy functions only when building with -O0,
making them all BUG().

--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up

Next message: Stephen Boyd: "Re: [PATCH v3 3/3] clk: qcom: Add Global Clock controller (GCC) driver for SC7180"
Previous message: David Laight: "RE: [PATCH] staging: rtl8723bs: hal: Fix memcpy calls"
In reply to: Andrew Murray: "Re: [PATCH] Partially revert "compiler: enable CONFIG_OPTIMIZE_INLINING forcibly""
Next in thread: Andrew Murray: "Re: [PATCH] Partially revert "compiler: enable CONFIG_OPTIMIZE_INLINING forcibly""
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]