Re: objtool warning "uses BP as a scratch register" with clang-9

From: Arnd Bergmann
Date: Thu Aug 29 2019 - 16:22:03 EST


On Thu, Aug 29, 2019 at 8:30 PM Linus Torvalds
<torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
> On Thu, Aug 29, 2019 at 10:35 AM Josh Poimboeuf <jpoimboe@xxxxxxxxxx> wrote:
> So:
>
> - we do want "memcpy()" to become "__builtin_memcpy()" which can then
> be optimized to either individual inlined assignments _or_ to an
> out-of-line call to memcpy().
>
> - we do *not* want individual assignments to be randomly turned into
> memset/memcpy(), because of various different reasons (including
> function tracing, but also store tearing, yadda yadda)
>
> Conceptually, "-ffreestanding" is definitely what a kernel needs, but
> it has been *too* big of a hammer and disables real code generation,
> iirc.

I just tried passing only "-fno-builtin-memcpy -fno-builtin-memset".
This avoids going all the way to -ffreestanding and prevents the insertion
of unwanted memcpy and memset calls, but unfortunately (and
unsurprisingly) it also prevents the optimization of trivial memset calls.
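
As a rough illustration of what a "trivial memset call" means here, a
minimal sketch follows. The structure and function names are hypothetical
and not taken from the kernel tree, and the exact code generation depends
on the compiler version, target and optimization level.

```c
#include <string.h>

/* Hypothetical two-word structure, used only for illustration. */
struct pair {
	unsigned long a;
	unsigned long b;
};

/*
 * A fixed-size memset() of a small object. With builtins enabled,
 * compilers typically expand this into a couple of plain stores.
 * With -fno-builtin-memset the compiler no longer treats memset()
 * specially, so this is expected to stay an out-of-line call.
 */
void pair_clear(struct pair *p)
{
	memset(p, 0, sizeof(*p));
}
```

Comparing the assembly for pair_clear() with and without
-fno-builtin-memset should show the lost optimization.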

On the other hand, I could not produce any trivial case like this without
CONFIG_KASAN, see https://godbolt.org/z/v440Qy

clang seems to behave similarly to gcc here: it will produce
calls to memset or memcpy when setting a lot of adjacent
members (17 for x86-clang, 29 for arm64 gcc), but not for two
or three of them. x86 gcc appears to always use string instructions
over memset().
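
The kind of test case meant here can be sketched roughly as below. The
struct and its 20 members are made up for illustration (this is not the
code behind the godbolt link), and whether a given compiler actually
merges the stores into a memset() call depends on its version, target and
flags, so the threshold may differ from the numbers above.

```c
/* Hypothetical structure with many adjacent same-sized members. */
struct wide {
	unsigned long f0, f1, f2, f3, f4;
	unsigned long f5, f6, f7, f8, f9;
	unsigned long f10, f11, f12, f13, f14;
	unsigned long f15, f16, f17, f18, f19;
};

/*
 * Individual assignments to adjacent members. At -O2 a compiler may
 * merge a long enough run of these stores into a single call to
 * memset(), even though the source never calls it. With only two or
 * three assignments this was not observed.
 */
void wide_reset(struct wide *w)
{
	w->f0 = 0;  w->f1 = 0;  w->f2 = 0;  w->f3 = 0;  w->f4 = 0;
	w->f5 = 0;  w->f6 = 0;  w->f7 = 0;  w->f8 = 0;  w->f9 = 0;
	w->f10 = 0; w->f11 = 0; w->f12 = 0; w->f13 = 0; w->f14 = 0;
	w->f15 = 0; w->f16 = 0; w->f17 = 0; w->f18 = 0; w->f19 = 0;
}
```

Building this with and without -fno-builtin-memset and checking the
output for a call to memset shows whether the flag has the intended
effect.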

Maybe we can just pass -fno-builtin-memcpy -fno-builtin-memset
for clang when CONFIG_KASAN is set and hope for the best?

Arnd