RE: Build performance regressions originating from min()/max() macros

From: David Laight
Date: Wed Jul 24 2024 - 04:35:07 EST


From: Jürgen Groß
> Sent: 24 July 2024 09:14
>
> On 23.07.24 23:59, Lorenzo Stoakes wrote:
> > Arnd reported a significant build slowdown [0], which was bisected to the
> > series spanning commit 80fcac55385c ("minmax: relax check to allow
> > comparison between unsigned arguments and signed constants") to commit
> > 867046cc70277 ("minmax: relax check to allow comparison between unsigned
> > arguments and signed constants"), originating from the series "minmax:
> > Relax type checks in min() and max()." [1].
> >
> > I have reproduced this locally, reverting this series and manually fixing
> > up all call sites that invoke min()/max() for a simple x86-64 defconfig (+
> > some other debug flags I use for debug kernels, I can provide the .config
> > if needed).
> >
> > Arnd noted that the arch/x86/xen/setup.c file was particularly problematic,
> > taking 15 (!) seconds to pre-process on his machine, so I also enabled
> > CONFIG_XEN to test this and obtained performance numbers with this set/not
> > set.
> >
> > I was able to reproduce this very significant pre-processor time on this
> > file, noting that with the series reverted compile time for the file is
> > 0.79s, with it in place, it takes 6.90s for a 873.4% slowdown.
...
> > Which suggests a worryingly significant slowdown of ~45s with CONFIG_XEN
> > enabled and ~35s even without it.
> >
> > The underlying problems seems to be very large macro expansions, which Arnd
> > noted in the xen case originated from the line:
> >
> > extra_pages = min3(EXTRA_MEM_RATIO * min(max_pfn, PFN_DOWN(MAXMEM)),
> > extra_pages, max_pages - max_pfn);

Jeepers, that is definitely going to be big - it never was small.
The expansion of min() itself isn't that bad.
But both arguments get expanded a few times (and a few more than the old code).
So a nested min/max causes an O(n^2) expansion - livable.
But min3(a,b,c) is just min(min(a,b),c) - so a nested call.
(It seems to have been added to avoid the cost of the nested call, but
just implemented a nested call anyway.)
Here one of the arguments to min3() is a min() - and I suspect it goes
into the inner min() so O(n^3) - which is bad news.

> >
> > And resulted in the generation of 47 MB (!) of pre-processor output.
> >
> > It seems a lot of code now relies on the relaxed conditions of the newly
> > changed min/max() macros, so the question is - what can we do to address
> > these regressions?
>
> I can send a patch to simplify the problematic construct, but OTOH this
> will avoid only one particularly bad example.

I suspect that reversing the order of the args to min3() will help this case.

There is an updated patch set I sent in January that reduces the
expansion a bit.
I've a reworked version that removes the last few patches (removing
support for constant output and using MIN() for that instead).
I'll repost it and maybe Arnd will pick it up?

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)