RE: objtool warning "uses BP as a scratch register" with clang-9

From: David Laight
Date: Mon Sep 02 2019 - 05:02:19 EST


From: Josh Poimboeuf
> Sent: 30 August 2019 17:49
> On Fri, Aug 30, 2019 at 08:48:49AM -0700, Linus Torvalds wrote:
> > On Fri, Aug 30, 2019 at 8:02 AM Josh Poimboeuf <jpoimboe@xxxxxxxxxx> wrote:
> > >
> > > For KASAN, the Clang threshold for inserting memset() is *2* consecutive
> > > writes instead of 17. Isn't that likely to cause tearing-related
> > > surprises?
> >
> > Tearing isn't likely to be a problem.
> >
> > It's not like memcpy() does byte-by-byte copies. If you pass it a
> > word-aligned pointer, it will do word-aligned accesses simply for
> > performance reasons.
> >
> > Even on x86, where we use "rep movsb", we (a) tend to disable it for
> > small copies and (b) it turns out that microcode that does the
> > optimized movsb (which is the only case we use it) probably ends up
> > doing atomic things anyway. Note the "probably". I don't have
> > microcode source code, but there are other indications like "we know
> > it doesn't take interrupts on a byte-per-byte level, only on the
> > cacheline level".
>
> The microcode argument is not all that comforting :-)
>
> Also what about unaligned accesses, e.g. if a struct member isn't on a
> word boundary? Arnd's godbolt link showed those can get combined too.

I'd guess that it has to 'complete' a partial copy.
After all there are no mid-instruction interrupt states, so the interrupt
returns to a new 'rep movsb' instruction (the ISR can change si/di/cx).
Either the source or the destination is almost certainly cache-line aligned.

> I don't see x86 memcpy() doing any destination alignment checks.

I don't think anyone has tried to measure whether it is better to
do misaligned reads or misaligned writes (and it probably depends on the CPU).
The code will probably be more critical on the reads.
The real gain will be when the source and destination have the same
misalignment.
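
To illustrate the "same misalignment" case, here is a rough userspace sketch.
It is not the kernel's memcpy(): the name copy_coaligned() and the byte-loop
fallback are made up for the example, and nothing here has been measured.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper for illustration only, not a kernel function. */
static void *copy_coaligned(void *dst, const void *src, size_t len)
{
	unsigned char *d = dst;
	const unsigned char *s = src;
	const size_t mask = sizeof(unsigned long) - 1;

	assert(dst != NULL && src != NULL);

	/* Word copies are only attempted when both pointers are
	 * misaligned by the same amount. */
	if ((((uintptr_t)d ^ (uintptr_t)s) & mask) == 0) {
		/* Copy bytes until both pointers reach a word boundary. */
		while (len && ((uintptr_t)d & mask)) {
			*d++ = *s++;
			len--;
		}
		/* Both pointers are now word aligned. */
		while (len >= sizeof(unsigned long)) {
			unsigned long w;

			memcpy(&w, s, sizeof(w));	/* aligned load */
			memcpy(d, &w, sizeof(w));	/* aligned store */
			d += sizeof(w);
			s += sizeof(w);
			len -= sizeof(w);
		}
	}

	/* Tail bytes, or the whole buffer when the misalignments differ. */
	while (len--)
		*d++ = *s++;

	return dst;
}
```

When the misalignments differ, one side of every word access would have to be
misaligned, so the sketch just falls back to bytes rather than guess which
side is cheaper.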

...
> > So it's probably not an issue from a tearing standpoint - but it
> > worries me because of "this has to be a leaf function" kind of issues
> > where we may be using individual stores on purpose. We do have things
> > like that.
>
> It sounds like everybody's in agreement that replacing accesses with
> memset/memcpy is bad in a kernel context. Should we push for a new
> fine-grained compiler option to disable it?

I'm not sure it is a good idea in ANY context.
It seems like something the compiler people have discovered they can do
without actually deciding whether it is useful.
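
For reference, the kind of code being discussed looks roughly like the
userspace sketch below. Whether a given compiler really turns clear_plain()
into a memset() call depends on the compiler, its version and the options
used. The struct and function names are invented for the example, and the
volatile casts only mimic the idea behind the kernel's WRITE_ONCE(), they are
not its definition.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical structure, invented for this example. */
struct regs {
	unsigned long a, b, c, d;
};

/* Plain consecutive stores: the compiler is allowed to merge these,
 * possibly into a call to memset(). */
static void clear_plain(struct regs *r)
{
	assert(r != NULL);
	r->a = 0;
	r->b = 0;
	r->c = 0;
	r->d = 0;
}

/* Volatile stores may not be merged or dropped, so in practice each
 * one is emitted as its own store instruction. */
static void clear_individually(struct regs *r)
{
	assert(r != NULL);
	*(volatile unsigned long *)&r->a = 0;
	*(volatile unsigned long *)&r->b = 0;
	*(volatile unsigned long *)&r->c = 0;
	*(volatile unsigned long *)&r->d = 0;
}
```

Both functions leave the structure in the same state; the only difference is
how much freedom the compiler has in getting there.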

David
