Re: Potential scheduler regression

From: Greg KH
Date: Mon Jul 10 2017 - 11:26:36 EST


On Mon, Jul 10, 2017 at 11:25:32AM +0200, Peter Zijlstra wrote:
> On Fri, Jul 07, 2017 at 04:55:27PM -0400, Ben Guthro wrote:
>
> > Apologies for the delay - it took a while to get the machines to run the test.
> >
> > I am happy to report that the kernel at 1ad3aaf3fcd2 seems to regain the
> > performance lost with 1b568f0aab in our test environment.
>
> Excellent.
>
> > Since 4.9 is an LTS kernel - would it be appropriate to suggest this
> > for inclusion on the linux-stable list?
>
> Hurm... so I typically suck at (also) keeping track of -stable things.
>
> But given LTS, there might be a few more commits that might make sense
> to include.
>
> This series corrects NUMA topology creation:
>
> 8c0334697dc3 ("sched/topology: Refactor function build_overlap_sched_groups()")
> c743f0a5c50f ("sched/fair, cpumask: Export for_each_cpu_wrap()")
> 0372dd2736e0 ("sched/topology: Fix building of overlapping sched-groups")
> 91eaed0d6131 ("sched/topology: Simplify build_overlap_sched_groups()")
> b0151c25548c ("sched/debug: Print the scheduler topology group mask")
> a420b0630362 ("sched/topology: Verify the first group matches the child domain")
> f32d782e31bf ("sched/topology: Optimize build_group_mask()")
> c20e1ea4b61c ("sched/topology: Move comment about asymmetric node setups")
> af85596c74de ("sched/topology: Remove FORCE_SD_OVERLAP")
> 73bb059f9b8a ("sched/topology: Fix overlapping sched_group_mask")
> 8d5dc5126bb2 ("sched/topology: Small cleanup")
> 005f874dd284 ("sched/topology: Add sched_group_capacity debugging")
> 1676330ecfa8 ("sched/topology: Fix overlapping sched_group_capacity")
>
> (there are a few more commits at the end of that series that add comments
> and rename a bunch of stuff, which don't really fix anything).
>
> Cures a BUG_ON through sysrq:
>
> 896bbb252258 ("sched/core: Allow __sched_setscheduler() in interrupts when PI is not used")
>
>
> Performance issues:
>
>
> 502ce005ab95 ("sched/fair: Use task_groups instead of leaf_cfs_rq_list to walk all cfs_rqs")
> a9e7f6544b9c ("sched/fair: Fix O(nr_cgroups) in load balance path")
>
> c249f255aab8 ("sched/rt: Minimize rq->lock contention in do_sched_rt_period_timer()")
>
> 8655d5497735 ("sched/numa: Use down_read_trylock() for the mmap_sem")
>
>
>
> And then the patch you want for this:
>
> 1ad3aaf3fcd2 ("sched/core: Implement new approach to scale select_idle_cpu()")
>
>
>
> I have no real idea how many of those qualify for 4.9, but I know most
> of those patches ended up in the various enterprise distros in some form
> or other.

If people have experience with these in the "enterprise" distros, or
any other tree, and want to provide me with backported and tested
patches, I'll be glad to consider them for stable kernels.

thanks,

greg k-h