sched: some perf bench around per entity load tracking

From: Vincent Guittot
Date: Tue Jun 09 2015 - 11:52:27 EST


Hi,

There are ongoing patches on the mailing list that modify the
scheduler load tracking area.
+Yuyang has rewritten the per entity load tracking:
https://lkml.org/lkml/2015/6/2/124
+Morten has also made some modifications to the load tracking:
https://lkml.org/lkml/2015/5/13/448. Patches 01-12 modify the load
tracking area. I haven't considered the end of the patchset, which
implements the energy awareness, as it is out of the scope of the
tests I wanted to do.

In order to have a better idea of the impact of each patchset on the
performance of the scheduler, I have run some benchmarks on a quad-core
ARM Cortex-A15 platform.

The list of benchmarks that I have run:
-perf bench sched pipe -l 1000000
-hackbench --loops 400 --datasize 4096
-memcpy
-sysbench test=threads
-sysbench test=cpu
-ebizzy.
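
How the runs were repeated and how the "+/-" figures were derived is not
stated in this mail. As a rough sketch only, assuming each benchmark is run
several times and the spread is reported as the sample standard deviation
relative to the mean, a small harness could look like the one below. The
names summarize and run_repeated, and the regular expression used to pick
the metric out of the tool output, are hypothetical and would need adapting
to each tool.

```python
import re
import statistics
import subprocess


def summarize(samples):
    """Return (mean, relative sample standard deviation in percent)."""
    mean = statistics.mean(samples)
    if len(samples) < 2 or mean == 0:
        return mean, 0.0
    return mean, 100.0 * statistics.stdev(samples) / abs(mean)


def run_repeated(cmd, pattern, runs=5):
    """Run cmd `runs` times and extract one float per run.

    `pattern` is a regex with one capture group matching the metric in
    the tool's output; its exact form is an assumption for each tool.
    """
    samples = []
    for _ in range(runs):
        out = subprocess.run(cmd, capture_output=True, text=True, check=True)
        match = re.search(pattern, out.stdout + out.stderr)
        if match is None:
            raise ValueError("metric not found in benchmark output")
        samples.append(float(match.group(1)))
    return summarize(samples)


# Possible usage (needs perf installed; the output pattern is assumed):
# run_repeated(["perf", "bench", "sched", "pipe", "-l", "1000000"],
#              r"([\d.]+)\s+ops/sec")
```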

Here are the results:

main: mainline kernel based on v4.1-rc6
http://git.kernel.org/cgit/linux/kernel/git/tip/tip.git/ sha1
9ef7adfa7c0b548665ef3248228d548586e693ca
pelt: main + Yuyang's patches
inv: main + Morten's patches

I have also run the benchmarks with and without CONFIG_SCHED_MC. The only
impact of this config on my platform is that an llc sched domain
pointer is set when the config is enabled.

die: CONFIG_SCHED_MC is not set. There is 1 sched_domain level (the die
level) with sched_domain flags: 0x102f
mc: CONFIG_SCHED_MC is set. There is 1 sched_domain level (the mc level)
with sched_domain flags: 0x22f
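
To see what separates the two flag words, they can be decoded bit by bit.
The sketch below assumes the SD_* values from include/linux/sched.h of a
v4.1-era kernel; the table is transcribed from memory and should be checked
against the tree actually used. The helper name decode is hypothetical.

```python
# SD_* bit values as assumed for a v4.1-era include/linux/sched.h.
SD_FLAGS = {
    0x0001: "SD_LOAD_BALANCE",
    0x0002: "SD_BALANCE_NEWIDLE",
    0x0004: "SD_BALANCE_EXEC",
    0x0008: "SD_BALANCE_FORK",
    0x0010: "SD_BALANCE_WAKE",
    0x0020: "SD_WAKE_AFFINE",
    0x0080: "SD_SHARE_CPUCAPACITY",
    0x0100: "SD_SHARE_POWERDOMAIN",
    0x0200: "SD_SHARE_PKG_RESOURCES",
    0x0400: "SD_SERIALIZE",
    0x0800: "SD_ASYM_PACKING",
    0x1000: "SD_PREFER_SIBLING",
    0x2000: "SD_OVERLAP",
    0x4000: "SD_NUMA",
}


def decode(flags):
    """Return the names of the bits set in a sched_domain flags word."""
    return [name for bit, name in sorted(SD_FLAGS.items()) if flags & bit]


die = decode(0x102f)
mc = decode(0x22f)
# Flags present in only one of the two configurations.
only_die = [f for f in die if f not in mc]
only_mc = [f for f in mc if f not in die]
print("die only:", only_die)  # ['SD_PREFER_SIBLING']
print("mc only: ", only_mc)   # ['SD_SHARE_PKG_RESOURCES']
```

Under that assumption the two setups share the balance and wake-affine
flags and differ only in SD_PREFER_SIBLING (die) versus
SD_SHARE_PKG_RESOURCES (mc).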

                                  main+die   main+mc  pelt+die   pelt+mc   inv+die    inv+mc
sched pipe            ops/sec     45091.40    44.30%    83.40%    43.78%    83.35%    40.42%
                      +/-            0.33%     3.79%     0.30%     2.63%     0.30%     0.60%
hackbench             duration        7.84    98.33%    99.27%    95.51%    99.03%    97.54%
                      +/-            0.37%     0.88%     1.08%     0.61%     1.18%     1.13%
memcopy               MB/s         4950.47   102.76%   100.76%    99.03%    99.44%   102.59%
                      +/-            4.09%     6.13%     4.98%     2.19%     5.30%     7.26%

sysbench test=threads
2 thrds/1 lock        events       5891.50    91.81%    94.81%    88.99%    99.18%    91.15%
                      +/-            0.39%     0.63%     0.92%     0.69%     0.43%     1.10%
3 thrds/1 lock        events       4061.83    86.10%    90.44%    82.56%   100.59%    86.45%
                      +/-            1.28%     2.08%     3.76%     1.11%     0.44%     1.87%
4 thrds/2 locks       events       6203.83    86.19%    89.09%    83.05%    99.61%    86.26%
                      +/-            1.69%     1.41%     2.78%     1.64%     0.88%     0.92%
5 thrds/2 locks       events       4062.00   137.43%   130.77%   132.53%    93.67%   136.80%
                      +/-            0.59%     0.89%     2.56%     2.06%     1.05%     1.29%
6 thrds/3 locks       events       5531.00   159.52%   109.85%   151.88%    96.11%   159.00%
                      +/-            1.59%     0.78%     1.76%     1.37%     2.72%     1.04%

ebizzy
1 thread              records/s    6040.50   100.68%    99.60%   101.05%    98.64%    97.42%
                      +/-            1.97%     1.50%     1.75%     0.90%     1.66%     0.95%
2 threads             records/s    9278.50   100.59%   101.21%   100.64%   100.71%    99.05%
                      +/-            2.82%     0.86%     0.59%     0.63%     0.88%     1.50%
3 threads             records/s   11205.33    99.75%   101.41%   100.98%   100.16%    97.64%
                      +/-            2.76%     2.13%     2.30%     1.51%     3.26%     2.58%
4 threads             records/s   10970.00   102.78%    99.59%   102.00%   107.24%   106.10%
                      +/-            3.39%     4.68%     3.63%     5.75%     4.07%     4.41%
5 threads             records/s   11716.50    95.57%    93.81%    96.36%    98.51%    96.81%
                      +/-            3.52%     4.95%     4.50%     5.27%     4.28%     5.51%
6 threads             records/s   11209.33    99.42%   100.33%    97.86%    99.38%    95.75%
                      +/-            3.57%     2.95%     5.16%     6.84%     3.70%     3.57%
7 threads             records/s   11204.50    99.55%    99.31%    95.73%    99.02%    96.55%
                      +/-            4.54%     4.22%     5.39%     3.71%     5.36%     3.69%
8 threads             records/s   17210.83    99.57%   100.65%    99.80%   100.16%   100.37%
                      +/-            2.01%     1.88%     1.22%     2.25%     2.86%     1.69%

I have skipped the results of sysbench cpu as they are "exactly" the
same with all kernels.

The first noticeable point is the impact of the LLC on sched pipe
and, to a lesser extent, on hackbench.
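
One way to isolate that LLC effect is to compare mc against die within each
kernel rather than against main+die. The sketch below does this for the
sched pipe row, assuming the percentages in the table are all relative to
the absolute main+die figure (the mail does not say so explicitly). The
helper name mc_over_die is hypothetical.

```python
# sched pipe row from the results: (die, mc) per kernel, in percent of
# main+die. main+die itself is taken as 100% under that assumption.
sched_pipe = {
    "main": (100.0, 44.30),
    "pelt": (83.40, 43.78),
    "inv": (83.35, 40.42),
}


def mc_over_die(results):
    """Return mc throughput as a percentage of die, per kernel."""
    return {k: round(100.0 * mc / die, 2) for k, (die, mc) in results.items()}


ratios = mc_over_die(sched_pipe)
for kernel, pct in ratios.items():
    print(f"{kernel}: mc reaches {pct}% of die")
```

With these inputs the mc configuration reaches roughly 44% (main), 52%
(pelt) and 48% (inv) of the die throughput for the same kernel.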

Then, the results don't show any clear performance advantage for any
one of the 3 kernels.
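
A crude way to read the table with the noise in mind is to call a result
"clearly different" from main+die only when its distance from 100% exceeds
the sum of both "+/-" figures. This is a heuristic introduced here for
illustration, not a statistical test and not the method used for this
mail; it also assumes the percentages are relative to main+die. The
function name clearly_different is hypothetical.

```python
def clearly_different(rel_pct, base_err_pct, variant_err_pct):
    """True if the gap to the 100% baseline exceeds the combined spread."""
    return abs(rel_pct - 100.0) > base_err_pct + variant_err_pct


# ebizzy, 1 thread, pelt+die: 99.60% with spreads of 1.97% and 1.75%.
print(clearly_different(99.60, 1.97, 1.75))   # False

# sysbench threads, 6 thrds/3 locks, main+mc: 159.52% with 1.59% and 0.78%.
print(clearly_different(159.52, 1.59, 0.78))  # True
```

Applied row by row, a check like this separates the cases where the
configurations really diverge from those where the gap sits inside the
measurement noise.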

I have just seen that Yuyang has sent some performance figures for his
patchset and AFAICT, there is no clear perf advantage for one version
of the kernel.

Has anyone else also run some benchmarks on these patchsets?

Regards,
Vincent