Re: [PATCH 3/5] perf core: Prepare writing into ring buffer from end

From: Wangnan (F)
Date: Sun Mar 27 2016 - 22:58:51 EST

On 2016/3/28 9:58, Wangnan (F) wrote:

On 2016/3/28 9:07, Wangnan (F) wrote:

On 2016/3/27 23:30, pi3orama wrote:

Sent from my iPhone

On Mar 27, 2016, at 11:20 PM, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:

On Fri, Mar 25, 2016 at 10:14:36PM +0800, Wangnan (F) wrote:
I think you enabled some unusual config options?

You must have enabled CONFIG_OPTIMIZE_INLINING. Now I get a similar result:
It has that indeed.


Test its performance by calling 'close(-1)' for 3000000 times and
use 'perf record -o /dev/null -e raw_syscalls:* test-ring-buffer' to
capture system calls:

BASE 800077.1 23448.13
RAWPERF.PRE 2465858.0 603473.70
RAWPERF.POST 2471925.0 609437.60

Considering the high standard deviation, the performance does not
change after applying this patch.
Why is your variance so immense? And doesn't that render the
measurements pointless?

For some unknown reason, about 10% of the
results are twice the normal value. Say the
"normal" results are about 2200000; the
"outliers" are then about 4400000 (I can't
access the raw data now). The variance becomes
much smaller if I remove those outliers.
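
To illustrate how a 10% share of doubled results can dominate the spread, here is a minimal sketch in C. The data set is synthetic (nine runs at 2200000 and one at 4400000, taken from the rough figures above, not from the raw measurements), and the names mean_of, pop_variance and with_outlier are made up for this example. It computes the population variance (dividing by n), which may differ from how the numbers in the table were computed.

```c
#include <assert.h>
#include <stddef.h>

/* Arithmetic mean of n samples. */
double mean_of(const double *x, size_t n)
{
    assert(n > 0);
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += x[i];
    return sum / (double)n;
}

/* Population variance: average squared distance from the mean
 * (divides by n, not n - 1). */
double pop_variance(const double *x, size_t n)
{
    double m = mean_of(x, n);
    double acc = 0.0;
    for (size_t i = 0; i < n; i++)
        acc += (x[i] - m) * (x[i] - m);
    return acc / (double)n;
}

/* Synthetic data (an assumption, not the measured samples):
 * nine "normal" runs followed by one outlier at twice that value. */
const double with_outlier[10] = {
    2200000, 2200000, 2200000, 2200000, 2200000,
    2200000, 2200000, 2200000, 2200000, 4400000
};
```

With the outlier included, the mean is 2420000 and the variance is 4.356e11, i.e. a standard deviation of 660000, the same order as the roughly 600000 reported for the RAWPERF rows. Passing only the first nine values gives a variance of 0 for this idealised data.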

Found the reason for these outliers.

If perf and 'test-ring-buffer' are scheduled on different processors,
the performance is bad. I think cache is the main reason.

I will redo the test, binding them to cores on the same CPU.

Thank you.

Test method improvements:

1. Set CPU freq:

# for f in /sys/devices/system/cpu/cpufreq/policy*/scaling_governor ; do echo performance > $f ; done

2. Bind core:
Add the following code at the start of test-ring-buffer:

cpu_set_t mask;
CPU_ZERO(&mask);
CPU_SET(6, &mask);
pthread_setaffinity_np(pthread_self(), sizeof(mask), &mask);

3. Bind core (perf):

Use the following command to start perf:

# taskset -c 7 ./perf record -o /dev/null --no-buildid-cache -e raw_syscalls:* test-ring-buffer
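
For reference, the pinned workload could be sketched roughly as below. The real test-ring-buffer source is not shown in this thread, so the function names (pin_to_cpu, run_close_loop, first_allowed_cpu) are hypothetical. The sketch also uses sched_setaffinity() with pid 0 in place of pthread_setaffinity_np(); for the calling thread the effect should be the same, and it avoids depending on the pthread library.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <sched.h>
#include <unistd.h>

/* Pin the calling thread to a single CPU.
 * Returns 0 on success, -1 on error (errno is set by the kernel). */
int pin_to_cpu(int cpu)
{
    cpu_set_t mask;

    CPU_ZERO(&mask);          /* start from an empty set */
    CPU_SET(cpu, &mask);      /* allow only the requested CPU */
    return sched_setaffinity(0, sizeof(mask), &mask); /* 0 = caller */
}

/* Issue `count` close(-1) calls, each a cheap system call that fails.
 * Returns how many of them failed with EBADF. */
long run_close_loop(long count)
{
    long ebadf = 0;

    assert(count >= 0);
    for (long i = 0; i < count; i++) {
        if (close(-1) == -1 && errno == EBADF)
            ebadf++;
    }
    return ebadf;
}

/* First CPU in the caller's current affinity mask, or -1 on error. */
int first_allowed_cpu(void)
{
    cpu_set_t mask;

    if (sched_getaffinity(0, sizeof(mask), &mask) != 0)
        return -1;
    for (int cpu = 0; cpu < CPU_SETSIZE; cpu++) {
        if (CPU_ISSET(cpu, &mask))
            return cpu;
    }
    return -1;
}
```

In the setup described above, the test program would call pin_to_cpu(6) and then run_close_loop(3000000), while perf is started under taskset on CPU 7.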

New results from 100 test runs in each case:

BASE 800214.950 2853.083
RAWPERF.PRE 2253846.700 9997.014
RAWPERF.POST 2257495.540 8516.293
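
To put the PRE and POST rows side by side, here is a small arithmetic sketch. It assumes the first column is the mean and the second the standard deviation over the 100 runs; the helper name relative_change is made up for this example.

```c
#include <assert.h>

/* Relative change from `before` to `after`, as a fraction of `before`. */
double relative_change(double before, double after)
{
    assert(before != 0.0);
    return (after - before) / before;
}

/* Values copied from the table above; reading the columns as
 * mean and standard deviation is an assumption. */
const double pre_mean  = 2253846.700;
const double pre_std   = 9997.014;
const double post_mean = 2257495.540;
```

relative_change(pre_mean, post_mean) comes out at about 0.0016, i.e. a difference of roughly 0.16% between the two means, while pre_std / pre_mean is about 0.44%.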

Thank you.