Re: [PATCH 0000/2297] [ANNOUNCE, RFC] "Fast Kernel Headers" Tree -v1: Eliminate the Linux kernel's "Dependency Hell"
From: Ingo Molnar
Date: Sat Jan 08 2022 - 06:54:37 EST
* Nathan Chancellor <nathan@xxxxxxxxxx> wrote:
> On Tue, Jan 04, 2022 at 11:47:30AM +0100, Ingo Molnar wrote:
> > > > With the fast-headers kernel that's down to ~36,000 lines of code,
> > > > almost a factor of 3 reduction:
> > > >
> > > > # fast-headers-v1:
> > > > kepler:~/mingo.tip.git> wc -l kernel/pid.i
> > > > 35941 kernel/pid.i
> > >
> > > Coming from someone who often has to reduce a preprocessed kernel source
> > > file with creduce/cvise to report compiler bugs, this will be a very
> > > welcomed change, as those tools will have to do less work, and I can get
> > > my reports done faster.
> >
> > That's nice, didn't think of that side effect.
> >
> > Could you perhaps measure this too, to see how much of a benefit it is?
>
> As it turns out, I got an opportunity to measure this sooner rather than
> later [1]. Using cvise [2] with an identical set of toolchains and
> interestingness test [3], reducing net/core/skbuff.c took significantly
> less time with the version from the fast-headers tree.
>
> v5.16-rc8:
>
> $ wc -l skbuff.i
> 105135 skbuff.i
>
> $ time cvise test.fish skbuff.i
> ...
> ________________________________________________________
> Executed in 114.02 mins fish external
> usr time 1180.43 mins 69.29 millis 1180.43 mins
> sys time 229.80 mins 248.11 millis 229.79 mins
>
> fast-headers:
>
> $ wc -l skbuff.i
> 78765 skbuff.i
>
> $ time cvise test.fish skbuff.i
> ...
> ________________________________________________________
> Executed in 47.38 mins fish external
> usr time 620.17 mins 32.78 millis 620.17 mins
> sys time 123.70 mins 122.38 millis 123.70 mins
>
> I was not expecting that much of a difference but it somewhat makes
> sense, as the tool spends less time eliminating unused code and the
> compiler invocations will be incrementally quicker as the input becomes
> smaller.
Indeed, that's a +140% speedup in cvise reduction time, not bad. :-)
I also got around to testing Clang (12) myself, and with my 'reference distro
config' I got these results:
#
# v5.16-rc8
#

Performance counter stats for 'make -j96 vmlinux LLVM=1' (3 runs):

   55,638,543,274,254      instructions      #    0.77  insn per cycle    ( +-  0.01% )
   72,074,911,968,393      cycles            #    3.901 GHz               ( +-  0.04% )
        18,490,451.51 msec cpu-clock         #   54.740 CPUs utilized     ( +-  0.04% )

          337.788 +- 0.834 seconds time elapsed  ( +-  0.25% )

#
# -fast-headers-v2-rc3
#

Performance counter stats for 'make -j96 vmlinux LLVM=1' (3 runs):

   30,904,130,243,855      instructions      #    0.76  insn per cycle    ( +-  0.02% )
   40,703,482,733,690      cycles            #    3.898 GHz               ( +-  0.00% )
        10,443,670.86 msec cpu-clock         #   58.093 CPUs utilized     ( +-  0.00% )

          179.773 +- 0.829 seconds time elapsed  ( +-  0.46% )

That's a +88% build speedup on Clang - even better than the +78% speedup on
GCC(-10).
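
For anyone who wants to re-derive the percentages, here is a minimal sketch of
the arithmetic, assuming "speedup" means baseline time divided by improved
time, minus one. The helper name speedup_percent is made up for illustration;
the inputs are the elapsed times quoted in this mail:

```python
def speedup_percent(baseline: float, improved: float) -> float:
    """Return how much faster `improved` is than `baseline`, in percent."""
    return (baseline / improved - 1.0) * 100.0


# Elapsed wall-clock times quoted in this mail (baseline, fast-headers).
cvise_minutes = (114.02, 47.38)      # cvise reduction of skbuff.i
clang_seconds = (337.788, 179.773)   # 'make -j96 vmlinux LLVM=1'

print(f"cvise reduction: +{speedup_percent(*cvise_minutes):.0f}%")
print(f"Clang build:     +{speedup_percent(*clang_seconds):.0f}%")
```

Note that this is a ratio of elapsed times, so a +88% speedup corresponds to
the build taking a bit more than half as long, not to 88% of the time being
saved.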
Thanks,
Ingo