[LKP] [x86_64, entry] 103dd9885d1: +7.3% will-it-scale.per_process_ops

From: Huang Ying
Date: Tue Feb 03 2015 - 02:47:58 EST


FYI, we noticed the following changes on

git://git.kernel.org/pub/scm/linux/kernel/git/luto/linux.git x86/entry
commit 103dd9885d1e96be0c7493ce9a33af0732e35441 ("x86_64, entry: Use sysret to return to userspace when possible")
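For reference, will-it-scale's open2 and signal1 cases are tight syscall loops, so per-op throughput largely tracks kernel entry/exit overhead, which is what the sysret fast-return path in this commit trims. Below is a minimal open2-style sketch for illustration only (not the will-it-scale source; the temp-file template and 5-second window are arbitrary choices):

/*
 * Minimal open2-style sketch, for illustration only (not the will-it-scale
 * source).  Each iteration is one open()+close() pair, so ops/sec is
 * dominated by syscall entry/exit cost -- the path this commit changes.
 * The temp-file template and 5-second window are arbitrary choices.
 */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>

int main(void)
{
	char path[] = "/tmp/open2-sketch-XXXXXX";
	struct timespec start, now;
	long ops = 0;
	int fd = mkstemp(path);	/* create a scratch file to reopen in the loop */

	if (fd < 0)
		return 1;
	close(fd);

	clock_gettime(CLOCK_MONOTONIC, &start);
	do {
		fd = open(path, O_RDWR);	/* one open()+close() pair per op */
		close(fd);
		ops++;
		clock_gettime(CLOCK_MONOTONIC, &now);
	} while (now.tv_sec - start.tv_sec < 5);

	printf("%ld ops in ~5s (%ld ops/s)\n", ops, ops / 5);
	unlink(path);
	return 0;
}

The per_process_ops and per_thread_ops rows below come from running such a loop in separate processes versus threads sharing one address space.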


testbox/testcase/testparams: lkp-sb03/will-it-scale/open2

261ebee6aee55a47 103dd9885d1e96be0c7493ce9a
---------------- --------------------------
%stddev %change %stddev
\ | \
35.96 ± 6% -39.8% 21.66 ± 7% will-it-scale.time.user_time
237495 ± 1% +5.2% 249861 ± 1% will-it-scale.per_thread_ops
739981 ± 0% +7.3% 793911 ± 0% will-it-scale.per_process_ops
127 ± 20% -66.9% 42 ± 30% sched_debug.cfs_rq[14]:/.load
127 ± 20% -66.9% 42 ± 30% sched_debug.cpu#14.load
58 ± 21% -39.3% 35 ± 43% sched_debug.cfs_rq[7]:/.load
58 ± 21% -39.3% 35 ± 43% sched_debug.cpu#7.load
3442 ± 25% -42.5% 1978 ± 20% sched_debug.cpu#28.sched_goidle
3591 ± 36% +128.5% 8204 ± 17% sched_debug.cpu#10.ttwu_local
8241 ± 21% -36.9% 5202 ± 19% sched_debug.cpu#28.nr_switches
5411 ± 18% -39.0% 3301 ± 13% sched_debug.cpu#14.ttwu_local
8800 ± 15% -27.8% 6356 ± 20% sched_debug.cpu#9.ttwu_count
42 ± 22% -33.3% 28 ± 47% sched_debug.cfs_rq[14]:/.runnable_load_avg
67 ± 46% +87.1% 126 ± 31% sched_debug.cpu#12.load
573932 ± 1% +59.5% 915353 ± 1% slabinfo.kmalloc-256.active_objs
8970 ± 1% +59.5% 14305 ± 1% slabinfo.kmalloc-256.num_slabs
8970 ± 1% +59.5% 14305 ± 1% slabinfo.kmalloc-256.active_slabs
574152 ± 1% +59.5% 915575 ± 1% slabinfo.kmalloc-256.num_objs
35.96 ± 6% -39.8% 21.66 ± 7% time.user_time
1358 ± 24% -31.9% 924 ± 34% sched_debug.cpu#31.ttwu_local
2487 ± 14% -32.6% 1677 ± 9% sched_debug.cpu#31.sched_goidle
17643 ± 8% -33.6% 11719 ± 5% sched_debug.cpu#14.nr_switches
8632 ± 11% -34.9% 5617 ± 14% sched_debug.cpu#14.ttwu_count
29 ± 12% +31.9% 38 ± 15% sched_debug.cpu#12.cpu_load[1]
115453 ± 3% +48.3% 171178 ± 2% numa-meminfo.node0.SUnreclaim
28672 ± 2% +49.2% 42775 ± 2% numa-vmstat.node0.nr_slab_unreclaimable
3313 ± 15% +35.0% 4472 ± 13% sched_debug.cpu#11.curr->pid
132090 ± 3% +42.8% 188646 ± 2% numa-meminfo.node0.Slab
200693 ± 1% +42.8% 286535 ± 0% meminfo.SUnreclaim
50431 ± 1% +42.0% 71624 ± 1% proc-vmstat.nr_slab_unreclaimable
26 ± 12% +29.8% 33 ± 12% sched_debug.cpu#12.cpu_load[4]
20434 ± 2% +38.4% 28286 ± 2% numa-vmstat.node1.nr_slab_unreclaimable
82040 ± 2% +36.9% 112273 ± 2% numa-meminfo.node1.SUnreclaim
14623 ± 26% +55.6% 22751 ± 12% sched_debug.cpu#10.nr_switches
235487 ± 1% +36.5% 321407 ± 0% meminfo.Slab
842597 ± 20% -30.2% 588471 ± 26% sched_debug.cfs_rq[27]:/.min_vruntime
6502 ± 25% +54.2% 10028 ± 11% sched_debug.cpu#10.sched_goidle
100197 ± 2% +29.4% 129675 ± 2% numa-meminfo.node1.Slab
237178 ± 1% -23.2% 182260 ± 2% softirqs.RCU
7310 ± 4% -26.2% 5397 ± 6% sched_debug.cpu#14.sched_goidle
6148 ± 12% -22.7% 4751 ± 10% sched_debug.cpu#31.nr_switches
44905 ± 20% -28.5% 32101 ± 25% sched_debug.cfs_rq[27]:/.exec_clock
32 ± 6% +38.3% 44 ± 19% sched_debug.cpu#1.cpu_load[0]
139 ± 15% -25.4% 104 ± 18% sched_debug.cfs_rq[23]:/.tg_load_contrib
3454 ± 14% -17.7% 2844 ± 6% sched_debug.cpu#31.curr->pid
438 ± 8% -11.5% 388 ± 5% sched_debug.cfs_rq[31]:/.tg_runnable_contrib
30 ± 1% +11.6% 33 ± 6% sched_debug.cpu#23.cpu_load[4]
20115 ± 7% -11.3% 17841 ± 6% sched_debug.cfs_rq[31]:/.avg->runnable_avg_sum
35 ± 11% -19.0% 28 ± 9% sched_debug.cpu#21.cpu_load[0]
32 ± 6% +31.2% 42 ± 19% sched_debug.cfs_rq[1]:/.runnable_load_avg
32 ± 11% +27.1% 41 ± 10% sched_debug.cpu#9.cpu_load[2]
30 ± 1% +9.9% 33 ± 6% sched_debug.cpu#23.cpu_load[3]
575 ± 5% +12.4% 647 ± 6% sched_debug.cfs_rq[23]:/.tg_runnable_contrib
26412 ± 5% +12.5% 29710 ± 6% sched_debug.cfs_rq[23]:/.avg->runnable_avg_sum
29 ± 18% +33.6% 38 ± 14% sched_debug.cpu#12.cpu_load[0]
32 ± 5% +32.1% 43 ± 17% sched_debug.cpu#1.cpu_load[1]
33 ± 6% +26.3% 42 ± 16% sched_debug.cpu#1.cpu_load[2]
424232 ± 2% +13.7% 482375 ± 1% numa-meminfo.node0.MemUsed
3558 ± 6% +14.5% 4076 ± 3% sched_debug.cpu#15.curr->pid
1567 ± 0% -9.3% 1422 ± 1% vmstat.system.cs

testbox/testcase/testparams: lkp-sb03/will-it-scale/signal1

261ebee6aee55a47 103dd9885d1e96be0c7493ce9a
---------------- --------------------------
%stddev %change %stddev
\ | \
40.59 ± 3% -33.0% 27.21 ± 3% will-it-scale.time.user_time
0.11 ± 0% -10.5% 0.10 ± 0% will-it-scale.scalability
297467 ± 0% +10.2% 327822 ± 0% will-it-scale.per_thread_ops
526508 ± 0% +7.2% 564555 ± 0% will-it-scale.per_process_ops
2527 ± 43% -74.9% 633 ± 20% sched_debug.cpu#24.ttwu_local
3355 ± 33% -66.8% 1113 ± 14% sched_debug.cpu#24.ttwu_count
6776 ± 38% -61.6% 2601 ± 11% sched_debug.cpu#24.nr_switches
47 ± 44% -59.6% 19 ± 12% sched_debug.cfs_rq[10]:/.load
48 ± 40% -60.8% 19 ± 12% sched_debug.cpu#10.load
184 ± 33% -63.1% 68 ± 5% sched_debug.cfs_rq[1]:/.blocked_load_avg
80 ± 36% +97.8% 158 ± 34% sched_debug.cfs_rq[15]:/.blocked_load_avg
98 ± 30% +77.7% 175 ± 31% sched_debug.cfs_rq[15]:/.tg_load_contrib
215 ± 28% -54.3% 98 ± 3% sched_debug.cfs_rq[1]:/.tg_load_contrib
212 ± 32% -60.7% 83 ± 39% sched_debug.cfs_rq[26]:/.blocked_load_avg
238 ± 34% -56.0% 104 ± 35% sched_debug.cfs_rq[26]:/.tg_load_contrib
1175 ± 15% +82.0% 2139 ± 36% sched_debug.cpu#30.sched_goidle
40.59 ± 3% -33.0% 27.21 ± 3% time.user_time
183 ± 23% -32.1% 124 ± 27% sched_debug.cfs_rq[19]:/.tg_load_contrib
3095 ± 11% +81.1% 5606 ± 43% sched_debug.cpu#30.nr_switches
14 ± 5% +39.7% 20 ± 19% sched_debug.cpu#31.cpu_load[0]
1296 ± 11% +28.2% 1661 ± 15% sched_debug.cpu#5.ttwu_count
43 ± 9% +21.5% 52 ± 10% sched_debug.cpu#0.cpu_load[0]
59711 ± 0% -11.8% 52641 ± 7% sched_debug.cfs_rq[5]:/.exec_clock
29 ± 0% +13.8% 33 ± 7% sched_debug.cpu#21.cpu_load[0]
29 ± 6% -12.6% 26 ± 4% sched_debug.cfs_rq[5]:/.runnable_load_avg
30 ± 5% -13.2% 26 ± 6% sched_debug.cpu#5.load
29 ± 1% +12.0% 32 ± 6% sched_debug.cpu#21.cpu_load[2]
29 ± 0% +14.7% 33 ± 6% sched_debug.cpu#21.cpu_load[1]
15 ± 2% +24.6% 19 ± 18% sched_debug.cpu#31.cpu_load[1]
15 ± 2% +19.7% 18 ± 15% sched_debug.cpu#31.cpu_load[2]
30 ± 5% -12.4% 26 ± 6% sched_debug.cfs_rq[5]:/.load
29 ± 0% +13.8% 33 ± 7% sched_debug.cfs_rq[21]:/.runnable_load_avg
29 ± 0% +13.8% 33 ± 7% sched_debug.cfs_rq[21]:/.load
29 ± 0% +13.8% 33 ± 7% sched_debug.cpu#21.load

lkp-sb03: Sandy Bridge-EP
Memory: 64G




will-it-scale.time.user_time

45 ++---------------------------------------------------------------------+
| |
40 ++ .*.. .*.. .*.. |
35 *+.*..*.. .*..* *..*..*. *..*. *..*..*.. ..*..*. *.. .*
| *. : : *. *. |
30 ++ : : |
25 ++ : : |
O O O O O O: O :O O O O O O O O O O O |
20 ++ : : O |
15 ++ : : |
| : : |
10 ++ : : |
5 ++ : : |
| : |
0 ++-----------------*---------------------------------------------------+


[*] bisect-good sample
[O] bisect-bad sample

To reproduce:

apt-get install ruby ruby-oj
git clone git://git.kernel.org/pub/scm/linux/kernel/git/wfg/lkp-tests.git
cd lkp-tests
bin/setup-local job.yaml # the job file attached in this email
bin/run-local job.yaml


Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


Thanks,
Huang, Ying


---
testcase: will-it-scale
default-monitors:
wait: pre-test
uptime:
iostat:
vmstat:
numa-numastat:
numa-vmstat:
numa-meminfo:
proc-vmstat:
proc-stat:
meminfo:
slabinfo:
interrupts:
lock_stat:
latency_stats:
softirqs:
bdi_dev_mapping:
diskstats:
nfsstat:
cpuidle:
cpufreq-stats:
turbostat:
pmeter:
sched_debug:
interval: 10
default_watchdogs:
watch-oom:
watchdog:
cpufreq_governor:
commit: 69d668f2356670ab6d59e17c911a222d9546f50b
model: Sandy Bridge-EP
memory: 64G
hdd_partitions: "/dev/disk/by-id/ata-WDC_WD10EARS-00Y5B1_WD-WCAV5F059499-part3"
swap_partitions:
rootfs_partition: "/dev/disk/by-id/ata-WDC_WD10EARS-00Y5B1_WD-WCAV5F059499-part4"
perf-profile:
freq: 800
will-it-scale:
test:
- open2
testbox: lkp-sb03
tbox_group: lkp-sb03
kconfig: x86_64-rhel
enqueue_time: 2015-01-31 12:02:48.220308404 +08:00
head_commit: 69d668f2356670ab6d59e17c911a222d9546f50b
base_commit: 26bc420b59a38e4e6685a73345a0def461136dce
branch: linux-devel/devel-hourly-2015013122
kernel: "/kernel/x86_64-rhel/69d668f2356670ab6d59e17c911a222d9546f50b/vmlinuz-3.19.0-rc6-g69d668f"
user: lkp
queue: cyclic
rootfs: debian-x86_64-2015-01-30.cgz
result_root: "/result/lkp-sb03/will-it-scale/open2/debian-x86_64-2015-01-30.cgz/x86_64-rhel/69d668f2356670ab6d59e17c911a222d9546f50b/0"
job_file: "/lkp/scheduled/lkp-sb03/cyclic_will-it-scale-open2-x86_64-rhel-HEAD-69d668f2356670ab6d59e17c911a222d9546f50b-0.yaml"
dequeue_time: 2015-02-01 00:26:13.233880269 +08:00
nr_cpu: "$(nproc)"
job_state: finished
loadavg: 22.16 12.46 5.08 1/296 11219
start_time: '1422721619'
end_time: '1422721929'
version: "/lkp/lkp/.src-20150131-074115"
./runtest.py open2 25 both 1 8 16 24 32