RE: [genirq] cbe16f35be: will-it-scale.per_thread_ops -5.2% regression

From: Song Bao Hua (Barry Song)
Date: Wed Apr 28 2021 - 03:01:41 EST




> -----Original Message-----
> From: Feng Tang [mailto:feng.tang@xxxxxxxxx]
> Sent: Wednesday, April 28, 2021 5:08 PM
> To: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> Cc: kernel test robot <oliver.sang@xxxxxxxxx>; Song Bao Hua (Barry Song)
> <song.bao.hua@xxxxxxxxxxxxx>; Ingo Molnar <mingo@xxxxxxxxxx>; LKML
> <linux-kernel@xxxxxxxxxxxxxxx>; lkp@xxxxxxxxxxxx; lkp@xxxxxxxxx;
> ying.huang@xxxxxxxxx; zhengjun.xing@xxxxxxxxx; x86@xxxxxxxxxx
> Subject: Re: [genirq] cbe16f35be: will-it-scale.per_thread_ops -5.2%
> regression
>
> Hi Thomas,
>
> On Tue, Apr 27, 2021 at 01:42:12PM +0200, Thomas Gleixner wrote:
> > Folks,
> >
> > On Tue, Apr 27 2021 at 17:00, kernel test robot wrote:
> >
> > > Greeting,
> > >
> > > FYI, we noticed a -5.2% regression of will-it-scale.per_thread_ops due to commit:
> > >
> > > commit: cbe16f35bee6880becca6f20d2ebf6b457148552 ("genirq: Add
> > > IRQF_NO_AUTOEN for request_irq/nmi()")
> > > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git
> > > master
> >
> > this is the second report in the last week which makes not a lot of sense.
> > And this one makes absolutely no sense at all.
> >
> > This commit affects request_irq() and the related variants and has
> > exactly ZERO influence on anything related to that test case simply
> > because.
> >
> > I seriously have to ask the question whether this test infrastructure
> > is actually measuring what it claims to measure.
> >
> > As this commit clearly _cannot_ have the 'measured' side effect, this
> > points to some serious issue in the tests or the test infrastructure
> > itself.
>
> 0day has reported about 20 similar cases where the bisected commit has nothing
> to do with the benchmark case, and we were very confused too back then. Our
> debugging showed that many of them changed the alignment of kernel data or of
> text in other modules that is relevant to the benchmark, though some cases are
> not well explained yet. The following are links to some of the explained cases.
>
> https://lore.kernel.org/lkml/20200305062138.GI5972@shao2-debian/
> https://lore.kernel.org/lkml/20200330011254.GA14393@feng-iot/
> https://lore.kernel.org/lkml/20201102091543.GM31092@shao2-debian/
>
> To debug such code alignment cases, a debug patch that forces all function
> addresses to be aligned to 32 bytes was merged in v5.9:
>
> 09c60546f04f ./Makefile: add debug option to enable function aligned on 32 bytes
>
>
> For this particular case, the commit changes the code size of
> request_threaded_irq(), so the alignment of many of the functions that follow
> it changes as well.
>

If so, the performance impact of any code change would be random.

> So I extended the debug patch to force 64-byte alignment, and with that this
> commit causes _no_ performance change for the same test case on the same platform.
>
> diff --git a/Makefile b/Makefile
>
> ifdef CONFIG_DEBUG_FORCE_FUNCTION_ALIGN_32B
> -KBUILD_CFLAGS += -falign-functions=32
> +KBUILD_CFLAGS += -falign-functions=64
> endif
>
> Though I don't know the details of exactly how this code alignment affects the
> case.

I guess it is related to the icache.
But it is still a problem that has nothing to do with this commit.
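
To illustrate how a size change in one function can move every function
placed after it, here is a toy layout model in Python. It is only a sketch:
the function sizes, the 16-byte default alignment and the 64-byte cache line
are assumed numbers for illustration, not values taken from this kernel build,
and layout() and line_offsets() are hypothetical helpers, not kernel or
toolchain APIs.

```python
def layout(sizes, align):
    """Place functions back to back, each starting at the next multiple
    of `align` after the previous one ends. Returns start addresses."""
    addrs, cursor = [], 0
    for size in sizes:
        start = (cursor + align - 1) // align * align  # round up to `align`
        addrs.append(start)
        cursor = start + size
    return addrs


def line_offsets(sizes, align, line=64):
    """Offset of each function start within an assumed `line`-byte cache line."""
    return [addr % line for addr in layout(sizes, align)]


# Hypothetical function sizes in bytes; the first one grows by 16 bytes,
# standing in for a function whose code size was changed by a commit.
before = [200, 90, 130, 70]
after = [216, 90, 130, 70]

for align in (16, 64):
    print(align, line_offsets(before, align), line_offsets(after, align))
```

In this model, with 16-byte alignment the functions after the changed one
land at different offsets within a cache line, while with 64-byte alignment
every function starts on a line boundary before and after the change. That
is consistent with the 64-byte experiment showing no difference, though it
does not explain which function's placement actually matters for the
benchmark.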

>
> Thanks,
> Feng
>
> > Thanks,
> >
> > tglx

Thanks
Barry