Re: [GIT PULL] Clang feature updates for v5.14-rc1

From: Nick Desaulniers
Date: Tue Jun 29 2021 - 17:06:02 EST

On Tue, Jun 29, 2021 at 6:14 AM Mark Rutland <mark.rutland@xxxxxxx> wrote:
> Hi Kees,
> I thought the PGO stuff was on hold given Peter had open concerns, e.g.
> ... and there didn't seem to be a strong conclusion to the contrary.

Hi Mark,
If I could rephrase Peter's concerns in my own words to see if I
understood the intent correctly, I'd summarize the concerns as:
1. How does instrumentation interact with noinstr?

2. How much of this code can be reused with GCC?

3. Can we avoid proliferation of compiler specific code in the kernel?


Regarding point 1, I believe that was addressed by this series, which
Peter Ack'ed, and is based on work I did in LLVM based on Peter's
feedback, while collaborating with GCC developers on the semantics
with regard to inlining. I notice you weren't explicitly cc'ed on that
thread; that's my fault and I apologize. It wasn't intentional; once
a cc list as recommended by gets too long, I start
to forget who was on previous threads and might be interested in


Regarding point 2, I believe I addressed that in my response. Similar
to GCOV, we need the runtime hooks which are compiler specific in
order to capture the profiling data. Exporting such data to userspace
via sysfs can be easily shared though, as is done currently for GCOV.
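
As a rough sketch of that split, the compiler-specific part can sit behind a
small ops table while the export path stays common. Everything here is
hypothetical (struct demo_profile_ops, demo_export, the payload strings); it
is not the interface used by GCOV or by this series, just an illustration of
the shape.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical interface: each backend knows its own record layout. */
struct demo_profile_ops {
	const char *compiler;
	/* Copy profile data into buf if it fits; return bytes required. */
	size_t (*serialize)(char *buf, size_t len);
};

static size_t demo_copy(char *buf, size_t len, const char *data)
{
	size_t need = strlen(data);

	if (buf && len >= need)
		memcpy(buf, data, need);
	return need;
}

/* Compiler-specific halves: placeholder payloads, not real formats. */
static size_t demo_clang_serialize(char *buf, size_t len)
{
	return demo_copy(buf, len, "llvm-raw-profile");
}

static size_t demo_gcc_serialize(char *buf, size_t len)
{
	return demo_copy(buf, len, "gcc-gcda-records");
}

const struct demo_profile_ops demo_clang_ops = { "clang", demo_clang_serialize };
const struct demo_profile_ops demo_gcc_ops = { "gcc", demo_gcc_serialize };

/* Shared half: the export path is the same whichever compiler built us. */
size_t demo_export(const struct demo_profile_ops *ops, char *buf, size_t len)
{
	assert(ops && ops->serialize);
	return ops->serialize(buf, len);
}
```

The point of the design is that only the serialize callbacks would need to
be written per compiler; the code that hands the buffer to userspace does not
care which one produced it.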


Regarding point 3, I agree. There are currently two big places in the
kernel where we have very compiler-specific code, IMO:
1. frame pointer based unwinding on 32b ARM (especially but not
limited to THUMB).
2. GCOV.
This series does ask to add a third.

At the same time, there are differences between compilers that are
unlikely to converge without great need. Compiler IR is generally not
interchangeable between compilers; the compiler runtimes (i.e. symbols
typically provided by libgcc_s or compiler-rt) are (generally) tightly
coupled to their respective compilers. Since PGO relies on the
respective compiler runtimes, we wind up with compiler specific
runtime support for this feature. For a semi-freestanding environment
like the Linux kernel, that means duplicating the ABI for these
compiler runtime libraries, with additional code for kernel specific
synchronization, memory management, and data retrieval (sysfs).
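
To illustrate what "additional code" means in practice, here is a minimal
userspace sketch of a runtime hook that cannot rely on malloc or on a
library-provided lock. The hook name demo_profile_hit, the record layout and
the pool size are all assumptions made for the example; the real hooks and
data structures are dictated by each compiler's runtime ABI.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_SITES 64

struct demo_site {
	uint64_t id;    /* hypothetical identifier of an instrumented site */
	uint64_t count; /* number of times the site was hit */
};

/* Static pool instead of malloc: no allocator is assumed to exist. */
static struct demo_site demo_pool[DEMO_MAX_SITES];
static size_t demo_used;
static atomic_flag demo_lock = ATOMIC_FLAG_INIT;

static void demo_lock_acquire(void)
{
	while (atomic_flag_test_and_set_explicit(&demo_lock, memory_order_acquire))
		; /* spin until the flag is released */
}

static void demo_lock_release(void)
{
	atomic_flag_clear_explicit(&demo_lock, memory_order_release);
}

/* Record one hit; returns 0 on success, -1 if the pool is exhausted. */
int demo_profile_hit(uint64_t id)
{
	int ret = -1;
	size_t i;

	demo_lock_acquire();
	assert(demo_used <= DEMO_MAX_SITES);
	for (i = 0; i < demo_used; i++) {
		if (demo_pool[i].id == id) {
			demo_pool[i].count++;
			ret = 0;
			goto out;
		}
	}
	if (demo_used < DEMO_MAX_SITES) {
		demo_pool[demo_used].id = id;
		demo_pool[demo_used].count = 1;
		demo_used++;
		ret = 0;
	}
out:
	demo_lock_release();
	return ret;
}

/* Read back the counter for a site; 0 if the site was never hit. */
uint64_t demo_profile_count(uint64_t id)
{
	uint64_t count = 0;
	size_t i;

	demo_lock_acquire();
	for (i = 0; i < demo_used; i++) {
		if (demo_pool[i].id == id) {
			count = demo_pool[i].count;
			break;
		}
	}
	demo_lock_release();
	return count;
}
```

In the kernel the spin loop would be a proper kernel lock and the pool would
come from kernel memory management, which is exactly the part that has to be
rewritten rather than borrowed from the userspace runtime.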

Further, asking compiler vendors to break their existing ABIs with
their compiler runtimes to support a shared interface for profiling
data is also a hard sell. That's a major issue regarding frame pointer
based unwinding on 32b ARM as well; existing unwinders must change to
support the latest spec, yet not all code will be recompiled to match
it at the same time the unwinder support is added or updated. Unless
the compiler runtime was statically linked, upgrading that shared
object might break binaries the next time they are run. I'm not saying
it's impossible, but is it worth it? Do the compiler vendors agree?
~Nick Desaulniers