Re: [4.17 regression] Performance drop on kernel-4.17 visible on Stream, Linpack and NAS parallel benchmarks
From: Jirka Hladky
Date: Tue Sep 04 2018 - 06:07:29 EST
Hi Mel,
Thanks for sharing the background information! We will check whether
2d4056fafa196e1ab4e7161bae4df76f9602d56d is causing the current
regression in 4.19-rc1 and let you know the outcome.
Jirka
On Tue, Sep 4, 2018 at 11:00 AM, Mel Gorman <mgorman@xxxxxxxxxxxxxxxxxxx> wrote:
> On Mon, Sep 03, 2018 at 05:07:15PM +0200, Jirka Hladky wrote:
>> Resending in the plain text mode.
>>
>> > My own testing completed and the results are within expectations and I
>> > saw no red flags. Unfortunately, I consider it unlikely they'll be merged
>> > for 4.18. Srikar Dronamraju's series is likely to need another update
>> > and I would need to rebase my patches on top of that. Given the scope
>> > and complexity, I find it unlikely they would be accepted for an -rc,
>> > particularly this late of an rc. Whether we hit the 4.19 merge window or
>> > not will depend on when Srikar's series gets updated.
>>
>>
>> Hi Mel,
>>
>> we collaborated back in July on the scheduler patch that improves
>> performance by allowing faster memory migration. You came up with
>> the "sched-numa-fast-crossnode-v1r12" series here:
>>
>> https://git.kernel.org/pub/scm/linux/kernel/git/mel/linux.git
>>
>> which has shown good performance results both in your and our testing.
>>
>
> I remember.
>
>> Do you have an update on the current status? Is there any plan to
>> merge this series into the 4.19 kernel? We have just tested 4.19.0-0.rc1.1
>> and, based on the results, it seems that the patch is not included (and
>> I don't see it listed in git shortlog v4.18..v4.19-rc1
>> ./kernel/sched).
>>
>
> Srikar's series that mine depended upon was only partially merged due to
> a review bottleneck. He posted a v2 but it was during the merge window
> and likely will need a v3 to avoid falling through the cracks. When it
> is merged, I'll rebase my series on top and post it. While I didn't
> check against 4.19-rc1, I did find that rebasing on top of the partial
> series in 4.18 did not have as big an improvement.
>
>> With 4.19-rc1 we see a performance drop of
>> * up to 40% (NAS bench) relative to 4.18 + sched-numa-fast-crossnode-v1r12
>> * up to 20% (NAS, Stream, SPECjbb2005, SPECjvm2008) relative to 4.18 vanilla
>> It is unclear what the next steps are: should we wait for
>> "sched-numa-fast-crossnode-v1r12" to be merged, or should we start
>> looking at what has caused the drop in performance going from 4.18 to
>> 4.19-rc1?
>>
>
> Both are valid options. If you take the latter option, I suggest looking
> at whether 2d4056fafa196e1ab4e7161bae4df76f9602d56d is the source of the
> issue as at least one auto-bisection found that it may be problematic.
> Whether it is an issue or not depends heavily on the number of threads
> relative to the socket size.
>
> --
> Mel Gorman
> SUSE Labs