Re: [PATCH 0/5] kernel hacking: GCC optimization for debug experience (-Og)
From: Du, Changbin
Date: Wed May 02 2018 - 05:18:04 EST
On Wed, May 02, 2018 at 09:33:15AM +0200, Ingo Molnar wrote:
>
> * changbin.du@xxxxxxxxx <changbin.du@xxxxxxxxx> wrote:
>
> > Comparison of system performance: a slight drop.
> >
> > w/o CONFIG_DEBUG_EXPERIENCE
> > $ time make -j4
> > real 6m43.619s
> > user 19m5.160s
> > sys 2m20.287s
> >
> > w/ CONFIG_DEBUG_EXPERIENCE
> > $ time make -j4
> > real 6m55.054s
> > user 19m11.129s
> > sys 2m36.345s
>
> Sorry, that's not a proper kbuild performance measurement - there's no noise
> estimation at all.
>
> Below is a description that should produce more reliable numbers.
>
> Thanks,
>
> Ingo
>
Thanks for the suggestion; I will try your tips to eliminate the noise. Since the
test ran in a KVM guest, I just rebooted the guest before testing. But on the
host side I still need to account for these noise sources.
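
For reference, the relative slowdown implied by the two single-run 'real' times
above can be worked out as in the sketch below. The helper names to_seconds and
pct_change are made up for illustration, and with one run per configuration
there is no noise estimate, so the resulting figure is not a reliable measurement.

```shell
# Hypothetical helpers (names are illustrative, not from the patch series).

# to_seconds 6m43.619s -> 403.619
to_seconds() {
    awk -v t="$1" 'BEGIN {
        split(t, a, "m")          # a[1] = minutes, a[2] = seconds with trailing "s"
        sub(/s$/, "", a[2])
        printf "%.3f\n", a[1] * 60 + a[2]
    }'
}

# pct_change BASE NEW -> relative change of NEW versus BASE, in percent
pct_change() {
    awk -v a="$(to_seconds "$1")" -v b="$(to_seconds "$2")" \
        'BEGIN { printf "%.2f\n", (b - a) / a * 100 }'
}

# 'real' times quoted above: without vs. with CONFIG_DEBUG_EXPERIENCE
pct_change 6m43.619s 6m55.054s    # prints 2.83
```
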
>
> =========================>
>
> So here's a pretty reliable way to measure kernel build time, which tries to avoid
> the various pitfalls of caching.
>
> First I make sure that cpufreq is set to 'performance':
>
>    for ((cpu=0; cpu<120; cpu++)); do
>      G=/sys/devices/system/cpu/cpu$cpu/cpufreq/scaling_governor
>      [ -f $G ] && echo performance > $G
>    done
>
> [ ... because it can be *really* annoying to discover that an ostensible
> performance regression was a cpufreq artifact ... again. ;-) ]
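
A minimal sketch of how the governor setting could be double-checked before a
run. The function name check_governors and its optional directory argument are
assumptions added for illustration; the argument only exists so the function
can be tried against a scratch directory instead of the real sysfs tree.

```shell
# Hypothetical helper: report every CPU whose cpufreq governor is not
# 'performance'. Returns 0 if all present governors are 'performance'.
check_governors() {
    local root="${1:-/sys/devices/system/cpu}"
    local bad=0 g
    for g in "$root"/cpu[0-9]*/cpufreq/scaling_governor; do
        [ -f "$g" ] || continue        # file absent: nothing to check for this CPU
        if [ "$(cat "$g")" != "performance" ]; then
            echo "not performance: $g"
            bad=1
        fi
    done
    return "$bad"
}

# Usage: check_governors || echo "fix the governors before measuring"
```

CPUs without a scaling_governor file are skipped rather than reported, which
mirrors the '[ -f $G ]' guard in the loop quoted above.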
>
> Then I copy a kernel tree to /tmp (ramfs) as root:
>
>    cd /tmp
>    rm -rf linux
>    git clone ~/linux linux
>    cd linux
>    make defconfig >/dev/null
>
> ... and then we can build the kernel in such a loop (as root again):
>
>    perf stat --repeat 10 --null --pre '\
>       cp -a kernel ../kernel.copy.$(date +%s); \
>       rm -rf *; \
>       git checkout .; \
>       echo 1 > /proc/sys/vm/drop_caches; \
>       find ../kernel* -type f | xargs cat >/dev/null; \
>       make -j kernel >/dev/null; \
>       make clean >/dev/null 2>&1; \
>       sync '\
>       \
>       make -j16 >/dev/null
>
> ( I have tested these by pasting them into a terminal. Adjust the ~/linux source
> git tree and the '-j16' to your system. )
>
> Notes:
>
> - the 'pre' script portion is not timed by 'perf stat', only the raw build times
>
> - we flush all caches via drop_caches and re-establish everything again, but:
>
> - we also introduce an intentional memory leak by slowly filling up ramfs with
> copies of 'kernel/', thus continuously changing the layout of free memory,
> cached data such as compiler binaries and the source code hierarchy. (Note
> that the leak is about 8MB per iteration, so it isn't massive.)
>
> With 10 iterations this is the statistical stability I get on a big box:
>
> Performance counter stats for 'make -j128 kernel' (10 runs):
>
> 26.346436425 seconds time elapsed (+- 0.19%)
>
> ... which, despite a high iteration count of 10, is still surprisingly noisy,
> right?
>
> A 0.2% stddev is probably not enough to call a 0.7% regression with good
> confidence, so I had to use *30* iterations to make the measurement noise about
> an order of magnitude lower than the effect I'm trying to measure:
>
> Performance counter stats for 'make -j128' (30 runs):
>
> 26.334767571 seconds time elapsed (+- 0.09% )
>
> i.e. "26.334 +- 0.023" seconds is a number we can have pretty high confidence in,
> on this system.
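
The arithmetic behind the "26.334 +- 0.023" figure and the "order of magnitude"
remark can be sketched as below. The function names abs_noise and
effect_vs_noise are made up for illustration; the inputs are the numbers quoted
above.

```shell
# Hypothetical helpers (names are illustrative).

# abs_noise MEAN_SECONDS NOISE_PERCENT -> absolute noise in seconds
abs_noise() {
    awk -v m="$1" -v p="$2" 'BEGIN { printf "%.4f\n", m * p / 100 }'
}

# effect_vs_noise EFFECT_PERCENT NOISE_PERCENT -> how many times larger the
# effect is than the reported noise
effect_vs_noise() {
    awk -v e="$1" -v n="$2" 'BEGIN { printf "%.1f\n", e / n }'
}

abs_noise 26.334767571 0.09    # prints 0.0237 (quoted above as 0.023)
effect_vs_noise 0.7 0.19       # 10 runs: prints 3.7
effect_vs_noise 0.7 0.09       # 30 runs: prints 7.8
```

With 10 runs the 0.7% effect is only a few times larger than the reported
noise; with 30 runs the ratio gets close to the order of magnitude mentioned
above.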
>
> And just to demonstrate that it's all real, I repeated the whole 30-iteration
> measurement again:
>
> Performance counter stats for 'make -j128' (30 runs):
>
> 26.311166142 seconds time elapsed (+- 0.07%)
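
Where 'perf stat' is not at hand, a similar "mean +- noise" summary can be
approximated from a list of wall-clock times, one per line. This is only a
sketch: the function name summarize is made up, and it reports the standard
error of the mean relative to the mean, which is an assumption on my side and
may not match exactly how 'perf stat' derives its "+-" figure.

```shell
# Hypothetical helper: read one duration (seconds) per line on stdin and print
# "MEAN +- RELATIVE_STANDARD_ERROR%".
summarize() {
    awk '{ n++; s += $1; ss += $1 * $1 }
         END {
             if (n < 2) { print "need at least 2 samples"; exit 1 }
             mean = s / n
             var = (ss - n * mean * mean) / (n - 1)   # sample variance
             if (var < 0) var = 0                     # guard against rounding
             sem = sqrt(var / n)                      # standard error of the mean
             printf "%.3f +- %.2f%%\n", mean, sem / mean * 100
         }'
}

# Made-up inputs, not measurements:
printf '%s\n' 10 12 14 | summarize    # prints: 12.000 +- 9.62%
```
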
>
--
Thanks,
Changbin Du