Re: [PATCH RFC 0/3] Static calls

From: Ard Biesheuvel
Date: Mon Nov 12 2018 - 04:40:17 EST


On Mon, 12 Nov 2018 at 06:31, Josh Poimboeuf <jpoimboe@xxxxxxxxxx> wrote:
>
> On Mon, Nov 12, 2018 at 06:02:41AM +0100, Ingo Molnar wrote:
> >
> > * Josh Poimboeuf <jpoimboe@xxxxxxxxxx> wrote:
> >
> > > On Fri, Nov 09, 2018 at 08:28:11AM +0100, Ingo Molnar wrote:
> > > > > - I'm not sure about the objtool approach. Objtool is (currently)
> > > > > x86-64 only, which means we have to use the "unoptimized" version
> > > > > everywhere else. I may experiment with a GCC plugin instead.
> > > >
> > > > I'd prefer the objtool approach. It's a pretty reliable first-principles
> > > > approach while GCC plugin would have to be replicated for Clang and any
> > > > other compilers, etc.
> > >
> > > The benefit of a plugin is that we'd only need two of them: GCC and
> > > Clang. And presumably, they'd share a lot of code.
> > >

Having looked into this, I don't think they will share any code at
all, to be honest. Perhaps some macros and string templates, that's
all.

> > > The prospect of porting objtool to all architectures is going to be much
> > > more of a daunting task (though we are at least already considering it
> > > for some arches).
> >
> > Which architectures would benefit from ORC support the most?
>
> According to my (limited and potentially flawed) knowledge, I think
> arm64 would benefit the most performance-wise, whereas powerpc and s390
> gains would be quite a bit less.
>

What would arm64 gain from ORC and/or objtool?

> We may have to port objtool to arm64 anyway, for live patching.

Is this about the reliable stack traces, i.e., the ability to detect
non-leaf functions that don't create stack frames? I think we should
be able to manage this without objtool on arm64 tbh.

> But
> that will be a lot more work than it took for Ard to write a GCC plugin.
>
> > I really think that hard reliance on GCC plugins is foolish
>
> Funny, I feel the same way about hard reliance on objtool :-)
>

I tend to agree here. I think objtool is a necessary evil (as are
compiler plugins, for that matter) which I hope does not spread to
other architectures.

But the main difference is that the GCC plugin is only ~50 lines (for
this particular use case, and minus another 50 lines of boilerplate),
whereas objtool (AIUI) duplicates lots and lots of functionality of
the compiler, assembler and/or linker, to mangle relocations, create
new sections etc etc. Porting this to other architectures is going to
be a major maintenance effort, especially when I think of, e.g.,
32-bit ARM with its Thumb2 quirks and other idiosyncrasies that are
currently hidden in the toolchain. Other architectures should be first
class citizens if objtool gains support for them, which means that the
x86 people that own it currently are on the hook for testing their
changes against architectures they are not familiar with.

This obviously applies equally to compiler plugins, but those have a
lot more focus.


> > - but maybe Clang's plugin infrastructure is a guarantee that it
> > remains a sane and usable interface.
>
> Hopefully so. If it breaks, we could always write another tool, as the
> work is straightforward. Or we could make it an objtool subcommand
> which works on all arches.
>
> > > > All other usecases are bonus, but it would certainly be interesting to
> > > > investigate the impact of using these APIs for tracing: that too is a
> > > > feature enabled everywhere but utilized only by a small fraction of Linux
> > > > users - so literally every single cycle or instruction saved or hot-path
> > > > shortened is a major win.
> > >
> > > With retpolines, and with tracepoints enabled, it's definitely a major
> > > win. Steve measured an 8.9% general slowdown on hackbench caused by
> > > retpolines.
> >
> > How much of that slowdown is reversed?
>
> In theory, it should reverse all of the slowdown, and actually may even
> speed it up a little. Steve is working on measuring that now.
>
> --
> Josh