Re: [ANNOUNCE] "Fast Kernel Headers" Tree -v2
From: Ingo Molnar
Date: Sat Jan 22 2022 - 04:18:42 EST
* Arnd Bergmann <arnd@xxxxxxxx> wrote:
> > If we include comments & line-markers then the bloat goes up by another
> > ~2x:
> >
> > kepler:~/mingo.tip.git> ./st include/linux/sched.h
> > #include <linux/sched.h> | LOC: 2,186 | headers: 118
> > kepler:~/mingo.tip.git> ./st include/linux/sched.h
> > #include <linux/sched.h> | LOC: 4,092 | headers: 0
>
> The metric I've been focusing on is bytes of the preprocessed header,
> which is more sensitive to function definitions that get generated from
> macros, and I multiply this by the number of inclusions (from scanning
> the .file.o.cmd files). It probably helps to have a couple of metrics and
> look at all of them occasionally to not miss something important.
Actual inclusions don't just depend on the .file.o.cmd files though: scanning
those won't catch indirect inclusions, right?
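For reference, the "bytes times inclusions" metric described above could be sketched roughly as follows. This is only an illustration, not Arnd's actual script: the function names are made up, the parser assumes header paths show up in the .cmd text as whitespace-separated tokens ending in ".h", and the per-header preprocessed sizes are assumed to have been measured separately.

```python
import re
from collections import Counter


def headers_in_cmd(text):
    """Return the set of header paths named in one .cmd file's text.

    Assumption: header paths appear as whitespace-separated tokens
    ending in ".h"; everything else (commands, wildcards) is ignored.
    """
    tokens = re.split(r"[\s\\]+", text)
    return {tok for tok in tokens if tok.endswith(".h")}


def weighted_cost(cmd_texts, preprocessed_bytes):
    """Map each header to (objects listing it) * (its preprocessed size).

    cmd_texts: iterable of .cmd file contents, one per object file.
    preprocessed_bytes: hypothetical dict of header path -> size in bytes.
    Headers with no known size get a cost of 0.
    """
    counts = Counter()
    for text in cmd_texts:
        # Each object file counts a given header at most once.
        counts.update(headers_in_cmd(text))
    return {hdr: n * preprocessed_bytes.get(hdr, 0) for hdr, n in counts.items()}
```

Sorting the resulting dict by value would give a ranking of the most expensive headers under this metric. Whether the dependency lists capture every inclusion is the open question raised above, so the counts should be read as approximate.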
> In the meantime, I have made some progress on reducing the headers
> for arm64, on top of your tree from Jan 8, but I have not looked at
> later changes from your side, and I need to work on this a bit more
> to ensure this doesn't break other architectures.
Sure & great!
> For an arm64 allmodconfig build, my additional improvements on top
> of yours are significant but not as good as I had hoped for, this
> can still improve I hope:
>
> 5.16-rc8-vanilla 32640 seconds user, 3286 seconds sys
> 5.16-rc8-mingo 22990 seconds user, 2304 seconds sys
> 5.16-rc8-arnd 19007 seconds user, 1853 seconds sys
~71% build throughput speedup for allmodconfig is very impressive to me. :-)
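The figure can be checked from the quoted timings. The sketch below assumes the speedup is taken from user time alone, as (baseline / improved - 1); the helper name is made up for illustration.

```python
def speedup_percent(baseline_seconds, improved_seconds):
    """Throughput speedup of the improved build over the baseline, in percent."""
    return (baseline_seconds / improved_seconds - 1.0) * 100.0


# User-time figures from the arm64 allmodconfig builds quoted above.
user_seconds = {
    "5.16-rc8-vanilla": 32640,
    "5.16-rc8-mingo": 22990,
    "5.16-rc8-arnd": 19007,
}

baseline = user_seconds["5.16-rc8-vanilla"]
for tree, seconds in user_seconds.items():
    print(f"{tree}: {speedup_percent(baseline, seconds):.1f}% faster than vanilla")
```

For the -arnd tree this comes out a little above 71%.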
> As my tree builds any randconfig cleanly, [...]
Yeah, same here - having a few thousand randconfig build tests is normal
for each version:
/* This file is auto generated, version 3288 */
#define UTS_MACHINE "x86_64"
#define UTS_VERSION "#3288 Fri Jan 14 18:20:14 CET 2022"
My testing is mostly concentrated on x86 - but I often test ARM64
randconfig as well.
> I keep looking at different configs and find that this has a big impact,
> some options end up eliminating most of the benefits until I add further
> changes to clean up certain files. This happened with kasan, kprobes, and
> lse-atomics for instance. After eliminating all circular includes, I was
> also able to revisit my old script to visualize the inclusions, see [1]
> for the current arm64 defconfig output. This version uses my arbitrary
> metric as font-size, and uses labels for the number of inclusions.
This is really nice!
I was concentrating on optimizing a generic distro config, which doesn't
include the tons of extreme instrumentation that allmodconfig enables but
production distro kernels rarely do.
allmodconfig definitely needs more work, but 71% is a pretty good starting
point ...
Feel free to send in patches, I can help with the testing too.
Thanks,
Ingo