Re: [LKP] Re: [percpu_ref] 2b0d3d3e4f: reaim.jobs_per_min -18.4% regression

From: Xing, Zhengjun
Date: Mon Jan 18 2021 - 22:02:31 EST




On 1/11/2021 5:58 PM, Ming Lei wrote:
On Sun, Jan 10, 2021 at 10:32:47PM +0800, kernel test robot wrote:
Greeting,

FYI, we noticed a -18.4% regression of reaim.jobs_per_min due to commit:


commit: 2b0d3d3e4fcfb19d10f9a82910b8f0f05c56ee3e ("percpu_ref: reduce memory footprint of percpu_ref in fast path")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master


in testcase: reaim
on test machine: 192 threads Intel(R) Xeon(R) Platinum 9242 CPU @ 2.30GHz with 192G memory
with following parameters:

runtime: 300s
nr_task: 100%
test: short
cpufreq_governor: performance
ucode: 0x5002f01

test-description: REAIM is an updated and improved version of AIM 7 benchmark.
test-url: https://sourceforge.net/projects/re-aim-7/

In addition to that, the commit also has significant impact on the following tests:

+------------------+---------------------------------------------------------------------------+
| testcase: change | vm-scalability: vm-scalability.throughput -2.8% regression                |
| test machine     | 192 threads Intel(R) Xeon(R) Platinum 9242 CPU @ 2.30GHz with 192G memory |
| test parameters  | cpufreq_governor=performance                                              |
|                  | runtime=300s                                                              |
|                  | test=lru-file-mmap-read-rand                                              |
|                  | ucode=0x5003003                                                           |
+------------------+---------------------------------------------------------------------------+
| testcase: change | will-it-scale: will-it-scale.per_process_ops 14.5% improvement            |
| test machine     | 144 threads Intel(R) Xeon(R) CPU E7-8890 v3 @ 2.50GHz with 512G memory    |
| test parameters  | cpufreq_governor=performance                                              |
|                  | mode=process                                                              |
|                  | nr_task=50%                                                               |
|                  | test=page_fault2                                                          |
|                  | ucode=0x16                                                                |
+------------------+---------------------------------------------------------------------------+
| testcase: change | will-it-scale: will-it-scale.per_process_ops -13.0% regression            |
| test machine     | 104 threads Skylake with 192G memory                                      |
| test parameters  | cpufreq_governor=performance                                              |
|                  | mode=process                                                              |
|                  | nr_task=50%                                                               |
|                  | test=malloc1                                                              |
|                  | ucode=0x2006906                                                           |
+------------------+---------------------------------------------------------------------------+
| testcase: change | vm-scalability: vm-scalability.throughput -2.3% regression                |
| test machine     | 96 threads Intel(R) Xeon(R) CPU @ 2.30GHz with 128G memory                |
| test parameters  | cpufreq_governor=performance                                              |
|                  | runtime=300s                                                              |
|                  | test=lru-file-mmap-read-rand                                              |
|                  | ucode=0x5002f01                                                           |
+------------------+---------------------------------------------------------------------------+
| testcase: change | fio-basic: fio.read_iops -4.8% regression                                 |
| test machine     | 192 threads Intel(R) Xeon(R) Platinum 9242 CPU @ 2.30GHz with 192G memory |
| test parameters  | bs=4k                                                                     |
|                  | cpufreq_governor=performance                                              |
|                  | disk=2pmem                                                                |
|                  | fs=xfs                                                                    |
|                  | ioengine=libaio                                                           |
|                  | nr_task=50%                                                               |
|                  | runtime=200s                                                              |
|                  | rw=randread                                                               |
|                  | test_size=200G                                                            |
|                  | time_based=tb                                                             |
|                  | ucode=0x5002f01                                                           |
+------------------+---------------------------------------------------------------------------+
| testcase: change | stress-ng: stress-ng.stackmmap.ops_per_sec -45.4% regression              |
| test machine     | 96 threads Intel(R) Xeon(R) Gold 6252 CPU @ 2.10GHz with 256G memory      |
| test parameters  | class=memory                                                              |
|                  | cpufreq_governor=performance                                              |
|                  | disk=1HDD                                                                 |
|                  | nr_threads=100%                                                           |
|                  | testtime=10s                                                              |
|                  | ucode=0x5002f01                                                           |
+------------------+---------------------------------------------------------------------------+
I just ran a quick test of the last two on 2b0d3d3e4fcf ("percpu_ref: reduce memory footprint of
percpu_ref in fast path") and cf785af19319 ("block: warn if !__GFP_DIRECT_RECLAIM in bio_crypt_set_ctx()").

I don't see a difference between the two kernels (fio on null_blk with 224 hw queues,
and 'stress-ng --stackmmap-ops') on a 224-core, dual-socket system.

BTW this patch itself doesn't touch fast path code, so it is not supposed
to affect performance.

Can you double check if the test itself is good?

I re-tested "fio-basic: fio.read_iops -4.8% regression" more than 5 times; the average regression is -2.3%.
For stress-ng, the job normally runs many test cases one by one, whereas the command 'stress-ng --stackmmap-ops' tests only the "stackmmap" case.
I also tried testing only the "stackmmap" case; the regression is -7.8%.

But the main report here is "reaim.jobs_per_min -18.4% regression". I re-tested the "reaim" case, and the result is almost the same.
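
For reference, a minimal sketch of how an averaged regression figure like the ones above can be derived from repeated runs on the two kernels. This is not the LKP tooling: the function names are hypothetical, and the sample numbers are placeholders, not measured results.

```python
from statistics import mean


def percent_change(base: float, new: float) -> float:
    """Relative change of `new` against `base`, in percent.

    A negative value means the new kernel is slower (a regression).
    """
    if base == 0:
        raise ValueError("baseline must be non-zero")
    return (new - base) / base * 100.0


def average_regression(base_runs, new_runs):
    """Compare the mean of repeated runs on each kernel."""
    return percent_change(mean(base_runs), mean(new_runs))


if __name__ == "__main__":
    # Placeholder throughput values for illustration only.
    base_runs = [1000.0, 1010.0, 990.0]   # parent commit
    new_runs = [980.0, 975.0, 985.0]      # commit under test
    print(f"{average_regression(base_runs, new_runs):+.1f}%")
```

Averaging the runs before comparing, rather than averaging per-run percentages, keeps one noisy run from dominating the result.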

Note: cf785af19319 is 2b0d3d3e4fcf^



Thanks,
Ming

--
Zhengjun Xing