Re: [PATCH v9] pgo: add clang's Profile Guided Optimization infrastructure

From: Bill Wendling
Date: Sat Jun 12 2021 - 16:57:55 EST


On Sat, Jun 12, 2021 at 1:25 PM Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> On Sat, Jun 12, 2021 at 12:10:03PM -0700, Bill Wendling wrote:
> Yes it is, but is that sufficient in this case? It very much isn't for
> KASAN, UBSAN, and a whole host of other instrumentation crud. They all
> needed their own 'bugger-off' attributes.
>
> > > We've got KCOV and GCOV support already. Coverage is also not an
> > > argument mentioned anywhere else. Coverage can go pound sand, we really
> > > don't need a third means of getting that.
> > >
> > Those aren't useful for clang-based implementations. And I like to
> > look forward to potential improvements.
>
> I look forward to less things doing the same over and over. The obvious
> solution is of course to make clang use what we have, not the other way
> around.
>
That is not the obvious "solution".

> > > Do you have actual numbers that back up the sampling vs instrumented
> > > argument? Having the instrumentation will affect performance which can
> > > skew the profile just the same.
> > >
> > Instrumentation counts the number of times a branch is taken. Sampling
> > is at a gross level, where if the sampling time is fine enough, you
> > can get an idea of where the hot spots are, but it won't give you the
> > fine-grained information that clang finds useful. Essentially, while
> > sampling can "capture the hot spots very well", relying solely on
> > sampling is basically leaving optimization on the floor.
> >
> > Our optimization experts here have determined, through data of
> > course, that instrumentation is the best option for PGO.
>
> It would be very good to post some of that data and explicit examples.
> Hearsay doesn't carry much weight.

Should I add measurements from waving a dead chicken over my keyboard?
I heard somewhere that that works as well. Or how about measurements
from a feature that hasn't been integrated yet, like the perf-based
approach? I'm sure that will be worth my time. You can't just come up
with a potential, unimplemented alternative (gcov is still a thing, and
it doesn't use "perf") and expect people to dance to your tune.

I could give you numbers, but they would mean nothing to you, and I
suspect that you would reject them out of hand because it may not
benefit *everything*. The nature of FDO/PGO is that it's targeted to
specific tasks.

For example, Fangrui gave you numbers, and you rejected them out of
hand. I've explained to you why instrumentation is better than
sampling (at least for clang). Fangrui gave you numbers. Let's move on
to something else.
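
To make the instrumentation-versus-sampling distinction concrete, here
is a toy sketch (not clang's or perf's actual mechanism). The function
name, the branch condition, and the sample period are all hypothetical,
and the workload is deliberately contrived so that the rare branch never
lines up with a sample point. Real samplers are less regular than this,
so treat it only as an illustration of why per-branch counters carry
information that periodic snapshots can miss.

```python
def run_workload(n_iterations, sample_period):
    """Run a loop containing one branch; return (exact, sampled) counts."""
    # Instrumentation: one counter per branch outcome, bumped every time.
    exact = {"taken": 0, "not_taken": 0}
    # Sampling: the outcome is only recorded every `sample_period` iterations.
    sampled = {"taken": 0, "not_taken": 0}

    for i in range(n_iterations):
        # Hypothetical rare branch: taken once per 1000 iterations, at an
        # offset that never coincides with a sample point (assumption made
        # purely to show the worst case).
        outcome = "taken" if i % 1000 == 1 else "not_taken"

        exact[outcome] += 1
        if i % sample_period == 0:
            sampled[outcome] += 1

    return exact, sampled


exact, sampled = run_workload(n_iterations=100_000, sample_period=100)
print("instrumented:", exact)
print("sampled:     ", sampled)
```

In this contrived run the instrumented counters record every execution
of the rare branch, while the periodic sampler never observes it. Both
agree on where the hot path is, which matches the point that sampling
captures hot spots but not the fine-grained branch counts.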

Now, for the "noinstr" issue. I'll see if we need an additional change
for that.

-bw