Re: [LKP] Re: [sched/hotplug] 2558aacff8: will-it-scale.per_thread_ops -1.6% regression
From: Peter Zijlstra
Date: Tue Dec 15 2020 - 03:38:41 EST
On Tue, Dec 15, 2020 at 01:35:46PM +0800, Xing Zhengjun wrote:
> On 12/11/2020 12:14 AM, Peter Zijlstra wrote:
> > On Thu, Dec 10, 2020 at 04:18:59PM +0800, kernel test robot wrote:
> > > FYI, we noticed a -1.6% regression of will-it-scale.per_thread_ops due to commit:
> > > commit: 2558aacff8586699bcd248b406febb28b0a25de2 ("sched/hotplug: Ensure only per-cpu kthreads run during hotplug")
> >
> > Mooo, weird but whatever. Does the below help at all?
>
> I tested the patch
Thanks!
> , the regression reduced to -0.6%.
>
> =========================================================================================
> tbox_group/testcase/rootfs/kconfig/compiler/nr_task/mode/test/cpufreq_governor/ucode:
>
> lkp-cpl-4sp1/will-it-scale/debian-10.4-x86_64-20200603.cgz/x86_64-rhel-8.3/gcc-9/100%/thread/sched_yield/performance/0x700001e
>
> commit:
> 565790d28b1e33ee2f77bad5348b99f6dfc366fd
> 2558aacff8586699bcd248b406febb28b0a25de2
> 4b26139b8db627a55043183614a32b0aba799d27 (this test patch)
>
> 565790d28b1e33ee 2558aacff8586699bcd248b406f 4b26139b8db627a55043183614a
> ---------------- --------------------------- ---------------------------
>          %stddev     %change         %stddev     %change         %stddev
>              \          |                \          |                \
>  4.011e+08            -1.6%  3.945e+08            -0.6%  3.989e+08        will-it-scale.144.threads
>    2785455            -1.6%    2739520            -0.6%    2769967        will-it-scale.per_thread_ops
>  4.011e+08            -1.6%  3.945e+08            -0.6%  3.989e+08        will-it-scale.workload
Well, that's better. But I'm rather confused now, because with this new
patch, the actual hot paths are identical, so I've no idea what is
actually causing the regression :/

The above numbers don't seem to have variance, how sure are we the
results are stable? The thing is, when I tried reproducing this locally,
I was mostly looking at noise.
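
As a sanity check on the table above, here is a minimal Python sketch that
recomputes the %change columns from the reported per_thread_ops means. The
names pct_change() and rel_stddev() are made up for illustration and are not
part of the LKP tooling. rel_stddev() only shows what one would compute from
per-run samples; the report does not include any, so it is not called on
data here.

```python
import statistics

# per_thread_ops means copied from the quoted table (one mean per commit)
base = 2785455       # 565790d28b1e33ee (parent commit)
regressed = 2739520  # 2558aacff8586699 (reported commit)
patched = 2769967    # 4b26139b8db627a5 (test patch)


def pct_change(old, new):
    """Relative change of new versus old, in percent."""
    return (new - old) / old * 100.0


def rel_stddev(samples):
    """Sample standard deviation as a percentage of the mean.

    Needs at least two per-run samples; this is the kind of figure a
    %stddev column would carry if run-to-run variance were reported.
    """
    return statistics.stdev(samples) / statistics.mean(samples) * 100.0


print(f"reported commit: {pct_change(base, regressed):+.1f}%")
print(f"test patch:      {pct_change(base, patched):+.1f}%")
```

The two printed values round to the -1.6% and -0.6% in the table. Whether
deltas of that size are meaningful depends on the per-run spread, which a
single mean per commit cannot show.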
> > ---
> > kernel/sched/core.c | 40 +++++++++++++++-------------------------
> > kernel/sched/sched.h | 13 +++++--------
> > 2 files changed, 20 insertions(+), 33 deletions(-)
Anyway, let me queue this in sched/urgent, it's simpler code and has
a smaller regression.