Re: [4.17 regression] Performance drop on kernel-4.17 visible on Stream, Linpack and NAS parallel benchmarks

From: Mel Gorman
Date: Tue Sep 04 2018 - 05:01:00 EST


On Mon, Sep 03, 2018 at 05:07:15PM +0200, Jirka Hladky wrote:
> Resending in plain text mode.
>
> > My own testing completed and the results are within expectations and I
> > saw no red flags. Unfortunately, I consider it unlikely they'll be merged
> > for 4.18. Srikar Dronamraju's series is likely to need another update
> > and I would need to rebase my patches on top of that. Given the scope
> > and complexity, I find it unlikely they would be accepted for an -rc,
> > particularly this late of an rc. Whether we hit the 4.19 merge window or
> > not will depend on when Srikar's series gets updated.
>
>
> Hi Mel,
>
> we collaborated back in July on the scheduler patch that improves
> performance by allowing faster memory migration. You came up with
> the "sched-numa-fast-crossnode-v1r12" series here:
>
> https://git.kernel.org/pub/scm/linux/kernel/git/mel/linux.git
>
> which has shown good performance results both in your and our testing.
>

I remember.

> Do you have an update on the current status? Is there any plan to
> merge this series into the 4.19 kernel? We have just tested
> 4.19.0-0.rc1.1 and, based on the results, it seems that the patch is
> not included (and I don't see it listed in
> git shortlog v4.18..v4.19-rc1 ./kernel/sched).
>

Srikar's series that mine depended upon was only partially merged due to
a review bottleneck. He posted a v2 but it was during the merge window
and likely will need a v3 to avoid falling through the cracks. When it
is merged, I'll rebase my series on top and post it. While I didn't
check against 4.19-rc1, I did find that rebasing on top of the partial
series in 4.18 did not have as big an improvement.

> With 4.19rc1 we see a performance drop of
> * up to 40% (NAS bench) relative to 4.18 + sched-numa-fast-crossnode-v1r12
> * up to 20% (NAS, Stream, SPECjbb2005, SPECjvm2008) relative to 4.18 vanilla
> The performance is dropping. It's quite unclear what the next steps
> are - should we wait for "sched-numa-fast-crossnode-v1r12" to be
> merged or should we start looking at what has caused the drop in
> performance going from 4.18 to 4.19rc1?
>

Both are valid options. If you take the latter option, I suggest looking
at whether 2d4056fafa196e1ab4e7161bae4df76f9602d56d is the source of the
issue, as at least one auto-bisection found that it may be problematic.
Whether it is an issue or not depends heavily on the number of threads
relative to the socket size.
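
A minimal sketch of how that check might be scripted, assuming a local
clone of the kernel tree. The helper names (commit_in_ref,
revert_on_branch) and the branch name in the usage comment are
hypothetical and only for illustration; the commit id is the one named
above. The idea is to first confirm the suspect commit is contained in
the kernel under test, then build a branch with it reverted and rerun
the benchmarks on both.

```shell
#!/bin/sh
# Sketch only: helper names and the branch name below are hypothetical.
SUSPECT=2d4056fafa196e1ab4e7161bae4df76f9602d56d

# Succeeds if commit $2 is an ancestor of (i.e. contained in) ref $3
# in the repository at path $1.
commit_in_ref() {
    git -C "$1" merge-base --is-ancestor "$2" "$3"
}

# Create branch $4 at ref $3 in repository $1, then revert commit $2 on it.
# If the revert conflicts, git stops and leaves it for manual resolution.
revert_on_branch() {
    git -C "$1" checkout -q -b "$4" "$3" &&
    git -C "$1" revert --no-edit "$2"
}

# Example usage (not run here), assuming the kernel tree is in ./linux:
#   commit_in_ref linux "$SUSPECT" v4.19-rc1 && echo "suspect is in v4.19-rc1"
#   revert_on_branch linux "$SUSPECT" v4.19-rc1 test-revert
```

Comparing the benchmark results of v4.19-rc1 with and without the revert,
at several thread counts below and above the size of one socket, would
show whether this commit accounts for the drop on a given machine.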

--
Mel Gorman
SUSE Labs