Re: Kernel 4.7rc3 - Performance drop 30-40% for SPECjbb2005 and SPECjvm2008 benchmarks against 4.6 kernel

From: Jirka Hladky
Date: Thu Jun 16 2016 - 19:04:29 EST


> > we see performance drop 30-40% for SPECjbb2005 and SPECjvm2008
> Blergh, of course I don't have those.. :/

SPECjvm2008 is publicly available.
https://www.spec.org/download.html

We will prepare a reproducer and attach it to the BZ.

> What kind of config and userspace setup? Do you run this cruft in a
> cgroup of sorts?

No, we don't do any special setup except to control the number of threads.

Thanks for the hints about which commits are the most likely root cause.
We will try to find the commit that caused the regression.

Jirka



On Thu, Jun 16, 2016 at 7:22 PM, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> On Thu, Jun 16, 2016 at 06:38:50PM +0200, Jirka Hladky wrote:
>> Hello,
>>
>> we see performance drop 30-40% for SPECjbb2005 and SPECjvm2008
>
> Blergh, of course I don't have those.. :/
>
>> benchmarks starting from 4.7.0-0.rc0 kernel compared to 4.6 kernel.
>>
>> We have tested kernels 4.7.0-0.rc1 and 4.7.0-0.rc3 and these are as
>> well affected.
>>
>> We have observed the drop on variety of different x86_64 servers with
>> different configuration (different CPU models, RAM sizes, both with
>> Hyper Threading ON and OFF, different NUMA configurations (2 and 4
>> NUMA nodes)
>
> What kind of config and userspace setup? Do you run this cruft in a
> cgroup of sorts?
>
> If so, does it change anything if you run it in the root cgroup?
>
>> Linpack and Stream benchmarks do not show any performance drop.
>>
>> The performance drop increases with higher number of threads. The
>> maximum number of threads in each benchmark is the same as number of
>> CPUs.
>>
>> We have opened a BZ to track the progress:
>> https://bugzilla.kernel.org/show_bug.cgi?id=120481
>>
>> You can find more details along with graphs and tables there.
>>
>> Do you have any hints which commit should we try to reverse?
>
> There were only 66 commits or so, and I think we can rule out the
> hotplug changes, which should reduce it even further.
>
> You could see what the parent of this one does:
>
> 2159197d6677 sched/core: Enable increased load resolution on 64-bit kernels
>
> If not that, maybe the parent of:
>
> c58d25f371f5 sched/fair: Move record_wakee()
>
> After that I suppose you'll have to go bisect.
>
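
For reference, the steps suggested above (test the parent of each suspect
commit, then fall back to a bisect) could be sketched roughly as follows.
This is only a sketch: it assumes a clone of the mainline kernel tree in the
current directory with the v4.6 and v4.7-rc1 tags present, and the helper
names and the DRY_RUN switch are made up for illustration. With the default
DRY_RUN=1 it only prints the git commands instead of running them.

```shell
#!/bin/sh
# Sketch only. Assumptions: mainline kernel clone in the current directory,
# tags v4.6 (good) and v4.7-rc1 (bad) exist. Helper names are hypothetical.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Check out the parent of a suspect commit, i.e. the tree just before it.
checkout_parent() {
    run git checkout "$1^"
}

# Start a bisect; arguments are <good> <bad>, git expects <bad> first.
start_bisect() {
    run git bisect start "$2" "$1"
}

checkout_parent 2159197d6677   # sched/core: Enable increased load resolution
checkout_parent c58d25f371f5   # sched/fair: Move record_wakee()
start_bisect v4.6 v4.7-rc1     # only if neither parent settles it
```

After each checkout the kernel has to be built, booted and benchmarked
before moving on; during the bisect each result is reported with
"git bisect good" or "git bisect bad".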