Re: [LKP] [lkp] [sctp] a6c2f79287: netperf.Throughput_Mbps -37.2% regression

From: Aaron Lu
Date: Wed Aug 17 2016 - 01:38:04 EST


On 08/17/2016 01:04 PM, Aaron Lu wrote:
> On 08/16/2016 05:56 PM, Xin Long wrote:
>>>>>
>>>>> I'm testing on Linus' master, can we all use that please?
>>>>>
>>>>
>>>> [git] git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
>>>>
>>>> [machine]
>>>> Intel(R) Xeon(R) CPU E5-2690 v2 @ 3.00GHz
>>>> mem 62G (66000220K)
>>>>
>>>> [system]
>>>> # cat /etc/redhat-release
>>>> Red Hat Enterprise Linux Server release 7.3 Beta (Maipo)
>>>>
>>>> [commit 3684b03]
>>>> [root@hp-dl380pg8-11 lxin]# uname -r
>>>> 4.8.0-rc2.3684b03
>>>> [root@hp-dl380pg8-11 lxin]# cat test.sh
>>>> killall -0 netserver || netserver -4 &
>>>> netperf -4 -t SCTP_STREAM_MANY -c -C -l 300 -- -m 10K -H 127.0.0.1
>>>
>>> I just realized the test we are doing is not exactly the same.
>>> As the original report says:
>>> ip: ipv4
>>> runtime: 300s
>>> nr_threads: 200%
>>> cluster: cs-localhost
>>> send_size: 10K
>>> test: SCTP_STREAM_MANY
>>> cpufreq_governor: performance
>>>
>>> Note the nr_threads: 200%, which means starting twice as many netperf
>>> processes as there are CPUs.
>>>
>>> In our IVB i3 (2 cores, 2 threads per core) case, 8 netperf processes
>>> are started concurrently:
>> OK, understand.
>>
>>>
>>> 2016-07-27 03:48:09 netperf -4 -t SCTP_STREAM_MANY -c -C -l 300 -- -m 10K -H 127.0.0.1 &
>>> 2016-07-27 03:48:09 netperf -4 -t SCTP_STREAM_MANY -c -C -l 300 -- -m 10K -H 127.0.0.1 &
>>> 2016-07-27 03:48:09 netperf -4 -t SCTP_STREAM_MANY -c -C -l 300 -- -m 10K -H 127.0.0.1 &
>>> 2016-07-27 03:48:09 netperf -4 -t SCTP_STREAM_MANY -c -C -l 300 -- -m 10K -H 127.0.0.1 &
>>> 2016-07-27 03:48:09 netperf -4 -t SCTP_STREAM_MANY -c -C -l 300 -- -m 10K -H 127.0.0.1 &
>>> 2016-07-27 03:48:09 netperf -4 -t SCTP_STREAM_MANY -c -C -l 300 -- -m 10K -H 127.0.0.1 &
>>> 2016-07-27 03:48:09 netperf -4 -t SCTP_STREAM_MANY -c -C -l 300 -- -m 10K -H 127.0.0.1 &
>>> 2016-07-27 03:48:09 netperf -4 -t SCTP_STREAM_MANY -c -C -l 300 -- -m 10K -H 127.0.0.1 &
>>>
>>> The throughput is the average of those runs.
>>>
>>> And I think we should be testing on:
>>> commit a6c2f79287 ("sctp: implement prsctp TTL policy") (the bisected one)
>>> and
>>> commit 826d253d57 ("sctp: add SCTP_PR_ASSOC_STATUS on sctp sockopt") (its immediate parent)
>>> instead of Linus' master HEAD to avoid other factors.
>>>
>> OK, I will run the tests as you suggest now, but I need to rebuild first :D
>>
>> can you disable pr_enable with "sysctl -w net.sctp.prsctp_enable=0",
>> then try again?
>
> For commit a6c2f79287 ("sctp: implement prsctp TTL policy"), no matter
> the value of net.sctp.prsctp_enable, the throughput is almost the same:

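For reference, the nr_threads: 200% setup quoted above can be sketched
roughly as below. This is only an illustration, not the actual LKP job
script: the helper names nr_clients and run_many are made up here, and
the netperf command line is copied from the log quoted above.

```shell
#!/bin/sh
# Hypothetical helpers, not taken from the LKP scripts.

# nr_clients CPUS PERCENT: number of netperf processes to start.
nr_clients() {
    echo $(( $1 * $2 / 100 ))
}

# Start 200% of the online CPU count as concurrent netperf clients
# against a local netserver, then wait for all of them to finish.
run_many() {
    n=$(nr_clients "$(nproc)" 200)
    i=0
    while [ "$i" -lt "$n" ]; do
        netperf -4 -t SCTP_STREAM_MANY -c -C -l 300 -- -m 10K -H 127.0.0.1 &
        i=$((i + 1))
    done
    wait
}
```

On the IVB i3 (4 logical CPUs), nr_clients 4 200 gives 8, matching the
8 netperf processes in the log.
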
The perf-profile data for the two commits is attached. It covers the
prsctp_enable=1 case only; for some reason the perf-profile data was not
collected for the prsctp_enable=0 case, and I'm checking that problem now.

The CPU spends much more time idle with the bisected commit a6c2f79287:

68.89% 0.70% [kernel.kallsyms] [k] entry_SYSCALL_64_fastpath
49.32% 0.12% [kernel.kallsyms] [k] sys_sendmsg
49.17% 0.12% [kernel.kallsyms] [k] __sys_sendmsg
48.58% 0.22% [kernel.kallsyms] [k] ___sys_sendmsg
46.69% 0.06% [kernel.kallsyms] [k] sock_sendmsg
46.31% 0.16% [kernel.kallsyms] [k] inet_sendmsg
45.90% 0.98% [kernel.kallsyms] [k] sctp_sendmsg
29.66% 0.45% [kernel.kallsyms] [k] sctp_do_sm
29.54% 0.23% [kernel.kallsyms] [k] cpu_startup_entry
28.81% 0.68% [kernel.kallsyms] [k] sctp_cmd_interpreter.isra.24
26.20% 0.00% [kernel.kallsyms] [k] start_secondary
23.04% 0.09% [kernel.kallsyms] [k] sctp_inq_push
23.03% 0.08% [kernel.kallsyms] [k] call_cpuidle
22.94% 0.00% [kernel.kallsyms] [k] cpuidle_enter
22.60% 0.18% [kernel.kallsyms] [k] cpuidle_enter_state
21.99% 21.99% [kernel.kallsyms] [k] intel_idle
... ...

While its immediate parent commit 826d253d57 is mostly busy working:

98.53% 0.83% [kernel.kallsyms] [k] entry_SYSCALL_64_fastpath
78.13% 0.12% [kernel.kallsyms] [k] sys_sendmsg
78.03% 0.16% [kernel.kallsyms] [k] __sys_sendmsg
77.08% 0.28% [kernel.kallsyms] [k] ___sys_sendmsg
74.44% 0.08% [kernel.kallsyms] [k] sock_sendmsg
73.82% 0.13% [kernel.kallsyms] [k] inet_sendmsg
73.34% 1.44% [kernel.kallsyms] [k] sctp_sendmsg
47.52% 0.75% [kernel.kallsyms] [k] sctp_do_sm
46.19% 0.90% [kernel.kallsyms] [k] sctp_cmd_interpreter.isra.24
37.17% 1.43% [kernel.kallsyms] [k] sctp_outq_flush
36.93% 0.08% [kernel.kallsyms] [k] sctp_outq_uncork
34.24% 0.15% [kernel.kallsyms] [k] sctp_inq_push
... ...
No idle-related function is above 1%.
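
To make the comparison above repeatable, here is a rough sketch that
lists the idle-path symbols in a profile dump. It assumes the profile is
plain text laid out like the excerpts above (first column is the
inclusive percentage, last column is the symbol). The function name
idle_symbols and the symbol pattern are my own choices for illustration.

```shell
#!/bin/sh
# idle_symbols FILE [MIN_PCT]
# Print "symbol percentage" for idle-path symbols whose first-column
# percentage is at or above MIN_PCT (default 1).
# Assumed line layout:
#   29.54% 0.23% [kernel.kallsyms] [k] cpu_startup_entry
idle_symbols() {
    awk -v min="${2:-1}" '
        $NF ~ /idle|cpu_startup_entry|start_secondary/ {
            pct = $1
            sub(/%/, "", pct)          # strip the trailing percent sign
            if (pct + 0 >= min)
                print $NF, $1
        }' "$1"
}
```

Running it on a text dump of each profile (for example a hypothetical
a6c2f79287.txt and 826d253d57.txt) should print the cpuidle chain for
the former and nothing for the latter, if the excerpts above are
representative.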

Could the bisected commit be what makes the CPU go idle like this?

Thanks,
Aaron

Attachment: perf-profile-a6c2f79287.gz
Description: application/gzip

Attachment: perf-profile-826d253d57.gz
Description: application/gzip