Re: [PATCH] Expand CPU compiler options

From: Borislav Petkov
Date: Sun Dec 09 2012 - 06:18:55 EST


On Sat, Dec 08, 2012 at 04:13:59AM -0800, John wrote:
> I tested the attached patch written by André Ramnitz using three
> different machines running a generic x86-64 kernel and an otherwise
> identical kernel running with the optimized gcc options.
>
> Conclusion: There are small but real speed increases, using a make
> benchmark as the endpoint, when running with this patch.
>
> Details: 1) Three test machines: Intel Xeon X3360, Intel i7-2620M,
> Intel Core i7-3770K.
>
> 2) All ran the make benchmark (linked below) 35 times while booted
> into a 'generic' kernel. Then all ran the same make benchmark
> 35 times after booting into an optimized kernel. Below are the
> optimizations chosen for each machine.
> 2a) X3360 = core2
> 2b) i7-2620M = corei7-avx
> 2c) i7-3770K = core-avx-i
>
> 3) Analyzed resulting distributions for statistical significance via
> ANOVA plots that clearly show statistically significant, albeit small,
> differences.
>
> Links to ANOVA plots:
> http://s19.postimage.org/68urcofzn/corei7_avx.png
> http://s19.postimage.org/ozwomuak3/core_avx_i.png
>
> http://s19.postimage.org/d0l6fj4z7/core2.png
>
>
> References:
>
> Bash script that controls the benchmark:
> https://github.com/graysky2/bin/blob/master/bench
> Log file generated by script:
> http://repo-ck.com/bench/compile_time_optimization.txt.gz

Let's see, if I'm reading the log file correctly, the average values of
each test run differ by ~ 0.1 seconds tops.

For example, the i7-3770K generic build averages 69.41404 seconds while
the more optimized version averages 69.33554. The diff between the two is even
less than 0.1 second. The other two machines' diff is a bit higher. And
from looking at your graphs, this is all eaten up by stddev so I'd say
there aren't any improvements from using a different uarch target - just
noise. AFAICT, at least.
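
To put a number on it, here is a quick sketch of the arithmetic. The two
means are the ones quoted above; the function names are mine, and the
standard deviations would have to come from the log file, which I'm not
reproducing here. welch_t() is the ordinary Welch t statistic and is only
an assumption about how one might check this, not necessarily what the
ANOVA plots computed.

```python
import math

# Mean wall-clock times quoted above for the i7-3770K, in seconds.
generic_mean = 69.41404
optimized_mean = 69.33554


def relative_gain(baseline, candidate):
    """Return (difference in seconds, gain as a percentage of baseline)."""
    diff = baseline - candidate
    return diff, 100.0 * diff / baseline


def welch_t(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Difference of two means in units of its standard error (Welch)."""
    std_err = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    return (mean_a - mean_b) / std_err


diff_s, gain_pct = relative_gain(generic_mean, optimized_mean)
print(f"diff = {diff_s:.4f} s, gain = {gain_pct:.2f} %")
# prints: diff = 0.0785 s, gain = 0.11 %
```

With n = 35 runs per kernel, plugging the per-kernel stddev from the log
into welch_t() would show how many standard errors that 0.0785 s is worth.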

You could trace this same workload with perf as I told you originally to
see whether there are some other uarch benefits, and have a more precise
time measurement than using 'date' but I'm very sceptical.
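
As for the time measurement itself: perf stat can repeat a command on its
own (-r) and report the spread. If you'd rather stay in a script, a rough
sketch of timing with a monotonic high-resolution clock could look like
the one below. To be clear, this is not perf and gives no uarch counters;
time_command() and the placeholder workload are made up for illustration,
so substitute your real make invocation.

```python
import statistics
import subprocess
import sys
import time


def time_command(cmd, runs):
    """Run cmd `runs` times; return the elapsed wall-clock seconds per run."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()  # monotonic, high-resolution clock
        subprocess.run(
            cmd,
            check=True,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        samples.append(time.perf_counter() - start)
    return samples


if __name__ == "__main__":
    # Placeholder workload (an empty Python process), NOT the real benchmark.
    cmd = [sys.executable, "-c", "pass"]
    samples = time_command(cmd, runs=5)
    print(f"mean   = {statistics.mean(samples):.5f} s")
    print(f"stddev = {statistics.stdev(samples):.5f} s")
```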

In any case, these results are too marginal to warrant any code change
since they're basically disappearing in noise.

Btw, you might look into which optimizations exactly those different
compiler options enable - they might not improve performance much, if
at all, but merely add support for new instructions etc., i.e. features
which are not related to performance. And if that
is the case, there's no need for those different uarch build targets
in the kernel. Remember, the majority of linux kernels out there are
generic-x86_64 builds.
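
One way to look at that is to ask gcc itself what each -march value
switches on, via "gcc -march=<arch> -Q --help=target", and diff the two
listings. The sketch below assumes the usual two-column "option value"
layout of that output and a gcc in PATH that knows the given -march names;
the helper names are hypothetical.

```python
import subprocess


def parse_target_flags(text):
    """Parse 'gcc -Q --help=target' output into {option: value}."""
    flags = {}
    for line in text.splitlines():
        parts = line.split()
        # Option lines look like "-mavx  [enabled]"; skip headers and blanks.
        if len(parts) == 2 and parts[0].startswith("-"):
            flags[parts[0]] = parts[1]
    return flags


def changed_flags(base, other):
    """Return {option: (base_value, other_value)} for options that differ."""
    return {
        opt: (base.get(opt), other.get(opt))
        for opt in sorted(set(base) | set(other))
        if base.get(opt) != other.get(opt)
    }


def gcc_target_flags(march):
    """Ask the local gcc which target options -march=<march> results in."""
    result = subprocess.run(
        ["gcc", f"-march={march}", "-Q", "--help=target"],
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_target_flags(result.stdout)
```

Something like changed_flags(gcc_target_flags("x86-64"),
gcc_target_flags("corei7-avx")) would then list only the options that
differ between the generic and the optimized target.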

Thanks.

--
Regards/Gruss,
Boris.