Re: [PATCH v3 1/n] perf/core: addressing 4x slowdown during per-process profiling of STREAM benchmark on Intel Xeon Phi

From: Alexey Budankov
Date: Mon Jun 19 2017 - 11:27:55 EST

On 19.06.2017 18:14, Mark Rutland wrote:
On Mon, Jun 19, 2017 at 04:37:41PM +0300, Alexey Budankov wrote:
On 19.06.2017 16:26, Mark Rutland wrote:
On Mon, Jun 19, 2017 at 04:08:32PM +0300, Alexey Budankov wrote:
On 16.06.2017 1:10, Alexey Budankov wrote:
On 15.06.2017 22:56, Mark Rutland wrote:
On Thu, Jun 15, 2017 at 08:41:42PM +0300, Alexey Budankov wrote:
This patch series continues v2 and addresses the review comments received so far.

Specifically, this patch replaces the pinned_groups and flexible_groups
lists of perf_event_context with CPU-indexed red-black trees, avoiding
duplication of data structures and making it possible to iterate over
event groups for a specific CPU only.
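
To illustrate the idea of indexing event groups by CPU, here is a
minimal userspace sketch. It is not the code from the patch: it uses a
plain unbalanced binary search tree as a stand-in for the kernel rbtree,
and struct group, struct cpu_node and the cpu_tree_*() helpers are
hypothetical names made up for this example. The point is only that a
lookup by CPU reaches that CPU's groups without walking every group in
the context.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for an event group (not struct perf_event). */
struct group {
	int id;
	struct group *next;		/* next group bound to the same CPU */
};

/* One tree node per CPU key; cpu == -1 is used here for "any CPU". */
struct cpu_node {
	int cpu;
	struct group *groups;		/* all groups for this CPU */
	struct cpu_node *left, *right;
};

/* Return the node keyed by @cpu, or NULL if no group uses that CPU. */
static struct cpu_node *cpu_tree_find(struct cpu_node *root, int cpu)
{
	while (root && root->cpu != cpu)
		root = cpu < root->cpu ? root->left : root->right;
	return root;
}

/* Add group @id under key @cpu. Returns 0 on success, -1 on OOM. */
static int cpu_tree_insert(struct cpu_node **root, int cpu, int id)
{
	struct cpu_node **link = root;
	struct group *g;

	assert(root != NULL);

	/* Descend to the matching node or to the empty slot for it. */
	while (*link && (*link)->cpu != cpu)
		link = cpu < (*link)->cpu ? &(*link)->left : &(*link)->right;

	if (!*link) {
		*link = calloc(1, sizeof(**link));
		if (!*link)
			return -1;
		(*link)->cpu = cpu;
	}

	g = malloc(sizeof(*g));
	if (!g)
		return -1;
	g->id = id;
	g->next = (*link)->groups;
	(*link)->groups = g;
	return 0;
}

/*
 * Visit only the groups keyed by @cpu and return how many were seen.
 * @fn may be NULL if the caller just wants the count.
 */
static int cpu_tree_for_each(struct cpu_node *root, int cpu,
			     void (*fn)(struct group *, void *), void *data)
{
	struct cpu_node *node = cpu_tree_find(root, cpu);
	struct group *g;
	int count = 0;

	for (g = node ? node->groups : NULL; g; g = g->next) {
		if (fn)
			fn(g, data);
		count++;
	}
	return count;
}
```

With two flat lists, scheduling events for one CPU means scanning every
group and skipping those bound to other CPUs; with a keyed tree the scan
is limited to the groups under the matching key (plus, presumably, the
"any CPU" key).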

Have you thrown Vince's perf fuzzer at this?

If you haven't, please do. It can be found in the fuzzer directory of:


I ran the test suite and it revealed no additional regressions
compared to what I see on the clean kernel.

However, the fuzzer constantly reports some strange stacks that are
not seen on the clean kernel, and I have no idea how they might be
caused by the patches.

Ok; that was the kind of thing I was concerned about.

When you say "strange stacks", what do you mean exactly?

I take it the kernel is spewing backtraces in dmesg?

Can you dump those?

Here it is:

list_del corruption. prev->next should be ffff88c2c4654010, but was
[ 607.632813] ------------[ cut here ]------------
[ 607.632816] kernel BUG at lib/list_debug.c:53!

[ 607.635531] Call Trace:
[ 607.635583] list_del_event+0x1d7/0x210

Given this patch changes how list_{del,add}_event() works, it's possible
that this is a new bug.
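
For reference, the "list_del corruption" message comes from the list
debugging checks, which verify that an entry's neighbours still point
back at it before unlinking. Below is a simplified userspace sketch of
that invariant. It only mirrors the idea; the names list_del_checked()
and list_poison are made up for this example, and the real check in
lib/list_debug.c reports the corruption and BUGs rather than returning
a value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct list_head {
	struct list_head *next, *prev;
};

/* Hypothetical poison marker for entries that were already removed. */
static struct list_head list_poison;

static void list_init(struct list_head *head)
{
	head->next = head->prev = head;
}

/* Insert @new right after @head. */
static void list_add(struct list_head *new, struct list_head *head)
{
	assert(new != NULL && head != NULL);

	new->next = head->next;
	new->prev = head;
	head->next->prev = new;
	head->next = new;
}

/* An entry is safe to unlink only if both neighbours point back at it. */
static bool list_del_entry_valid(const struct list_head *entry)
{
	return entry->prev->next == entry && entry->next->prev == entry;
}

/*
 * Unlink @entry if the list around it is consistent.
 * Returns false instead of unlinking when corruption is detected,
 * for example when the entry was already deleted or was never added.
 */
static bool list_del_checked(struct list_head *entry)
{
	if (!list_del_entry_valid(entry))
		return false;

	entry->next->prev = entry->prev;
	entry->prev->next = entry->next;
	entry->next = entry->prev = &list_poison;
	return true;
}
```

A failure of the prev->next test, as in the trace above, typically
means the entry is being deleted from a list it is not actually on, or
that a neighbour was modified without the entry being unlinked first.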

I was going to try to test this on arm64, but I couldn't get the patch
to apply. I had a go with v4.12-rc5, tip/perf/core, and tip/perf/urgent.

Which branch should I be using as the base?

perf/core d0fabd1 [origin/perf/core] perf/core: Remove unused perf_cgroup_event_cgrp_time() function