Re: [signal] 4bad58ebc8: will-it-scale.per_thread_ops -3.3% regression

From: Thomas Gleixner
Date: Tue Apr 20 2021 - 14:35:15 EST


On Tue, Apr 20 2021 at 11:08, kernel test robot wrote:
> FYI, we noticed a -3.3% regression of will-it-scale.per_thread_ops due to commit:
>
> commit: 4bad58ebc8bc4f20d89cff95417c9b4674769709 ("signal: Allow tasks to cache one sigqueue struct")
> https://git.kernel.org/cgit/linux/kernel/git/tip/tip.git sched/core
>
> in testcase: will-it-scale
> on test machine: 192 threads Intel(R) Xeon(R) Platinum 9242 CPU @ 2.30GHz with 192G memory
> with following parameters:
>
> nr_task: 100%
> mode: thread
> test: futex3
> cpufreq_governor: performance
> ucode: 0x5003006
>
> test-description: Will It Scale takes a testcase and runs it from 1 through to n parallel copies to see if the testcase will scale. It builds both a process-based and a thread-based test in order to see any differences between the two.
> test-url: https://github.com/antonblanchard/will-it-scale
> commit:
> 69995ebbb9 ("signal: Hand SIGQUEUE_PREALLOC flag to __sigqueue_alloc()")
> 4bad58ebc8 ("signal: Allow tasks to cache one sigqueue struct")
>
> 69995ebbb9d37173            4bad58ebc8bc4f20d89cff95417
> ----------------            ---------------------------
>           %stddev   %change           %stddev
>               \         |                \
>  1.273e+09            -3.3%  1.231e+09         will-it-scale.192.threads
>    6630224            -3.3%    6409738         will-it-scale.per_thread_ops
>  1.273e+09            -3.3%  1.231e+09         will-it-scale.workload
>       1638   ± 3%     -7.8%       1510   ± 5%  sched_debug.cfs_rq:/.runnable_avg.max
>     297.83  ± 68%  +1747.6%       5502  ±152%  interrupts.33:PCI-MSI.524291-edge.eth0-TxRx-2
>     297.83  ± 68%  +1747.6%       5502  ±152%  interrupts.CPU12.33:PCI-MSI.524291-edge.eth0-TxRx-2

This change is definitely not causing more network traffic.

>       8200           -33.4%       5459  ± 35%  interrupts.CPU27.NMI:Non-maskable_interrupts
>       8200           -33.4%       5459  ± 35%  interrupts.CPU27.PMI:Performance_monitoring_interrupts
>       8199           -33.4%       5459  ± 35%  interrupts.CPU28.NMI:Non-maskable_interrupts
>       8199           -33.4%       5459  ± 35%  interrupts.CPU28.PMI:Performance_monitoring_interrupts
>       6148  ± 33%    -11.2%       5459  ± 35%  interrupts.CPU29.NMI:Non-maskable_interrupts
>       6148  ± 33%    -11.2%       5459  ± 35%  interrupts.CPU29.PMI:Performance_monitoring_interrupts
>       4287   ± 8%    +33.6%       5730  ± 15%  interrupts.CPU49.CAL:Function_call_interrupts
>       6356  ± 19%    +49.6%       9509  ± 19%  interrupts.CPU97.CAL:Function_call_interrupts

Nor does it increase the number of function call interrupts.

>     407730   ± 8%    +37.5%     560565   ± 7%  perf-stat.i.dTLB-load-misses
>     415959   ± 8%    +40.4%     583928   ± 7%  perf-stat.ps.dTLB-load-misses

And this massive increase does not make sense either.

Confused.
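
As a rough cross-check of the figures quoted above, the %change column can be re-derived from the two value columns, and each delta can be compared against the reported spread. This is only a sketch: the helper names (pct_change, interval, overlaps) are made up for illustration, and it assumes that "± N%" in the report means one standard deviation expressed as a percentage of the mean.

```python
def pct_change(base, new):
    """Relative change of `new` versus `base`, in percent."""
    return (new - base) / base * 100.0


def interval(mean, stddev_pct):
    """(low, high) for mean +/- one stddev, stddev given as % of the mean.

    Assumption: this is how the report's "± N%" column is to be read.
    """
    delta = abs(mean) * stddev_pct / 100.0
    return (mean - delta, mean + delta)


def overlaps(a, b):
    """True if the closed intervals a and b share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


# (base mean, base %stddev, new mean, new %stddev), copied from the report.
ROWS = {
    "interrupts.33:PCI-MSI.524291-edge.eth0-TxRx-2": (297.83, 68, 5502, 152),
    "perf-stat.i.dTLB-load-misses": (407730, 8, 560565, 7),
    "perf-stat.ps.dTLB-load-misses": (415959, 8, 583928, 7),
}

for name, (base, base_sd, new, new_sd) in ROWS.items():
    change = pct_change(base, new)
    noisy = overlaps(interval(base, base_sd), interval(new, new_sd))
    verdict = "within noise" if noisy else "outside noise"
    print(f"{change:+8.1f}%  {verdict:13}  {name}")
```

With the quoted numbers, the eth0 intervals overlap (both samples are very noisy), while the dTLB-load-misses intervals do not. That only says the dTLB delta is larger than the reported run-to-run spread; it says nothing about which commit is responsible for it.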

Thanks,

tglx