Re: [GIT PULL] Clang feature updates for v5.14-rc1

From: Linus Torvalds
Date: Tue Jun 29 2021 - 17:04:11 EST


On Tue, Jun 29, 2021 at 1:44 PM Kees Cook <keescook@xxxxxxxxxxxx> wrote:
> >
> > And it causes the kernel to be bigger and run slower.
>
> Right -- that's expected. It's not designed to be the final kernel
> someone uses. :)

Well, from what I've seen, you actually want to run real loads in
production environments for PGO to be anything but a bogus
"performance benchmarks only" kind of thing.

Of course, "performance benchmarks only" is very traditional, and
we've seen that used over and over in the past in this industry. That
doesn't make it _right_, though.

And if you actually want to have it usable in production environments,
you really should strive to run code as closely as possible to a
production kernel too.

You'd want to run something that you can sample over time, and in
production, not something that you have to build a special kernel for
that then gets used for a benchmark run, but can't be kept in
production because it performs so much worse.

Real proper profiles will tell you what *really* matters - and if you
don't have enough samples to give you good information, then that
particular code clearly is not important enough to waste PGO on.

This is not all that dissimilar to using gprof information for
traditional - manual - optimizations.

Sure, instrumented gprof output is better than nothing, but it is
*hugely* worse than proper sampled profiles that actually show
what matters for performance (as opposed to what runs a lot - the two
are not necessarily all that closely correlated, with cache misses
being a thing).

And I really hate how pretty much all of the PGO support seems to be
just about this inferior method of getting the data.

Linus