Re: [PATCH v5 4/4]: perf/core: complete replace of lists by rb trees for pinned and flexible groups at perf_event_context
From: Alexey Budankov
Date: Tue Jul 18 2017 - 09:38:37 EST
Hi,
On 18.07.2017 14:33, Alexander Shishkin wrote:
> Alexey Budankov <alexey.budankov@xxxxxxxxxxxxxxx> writes:
>
>> Hi,
>
> Hi,
>
>> Are there any new comments so far? Could you please suggest further steps forward?
>
> Apparently the patches are not threaded, so one needs to fish them out
> one by one in order to review.
Thanks for feedback. The BKM in this case is just to keep subject unchanged, right?
>
>> On 10.07.2017 16:03, Alexey Budankov wrote:
>>> perf/core: complete replace of lists by rb trees for pinned and
>>> flexible groups at perf_event_context
>
> No need to duplicate the subject line here. Also, it can be more concise
> than this like "perf: Replace context's pinned/flexible lists with trees".
>
>>> By default, the userspace perf tool opens per-cpu task-bound events
>>> when sampling, so for N logical events requested by the user, the tool
>>> will open N * NR_CPUS events.
>>>
>>> In the kernel, we mux events with a hrtimer, periodically rotating the
>>> flexible group list and trying to schedule each group in turn. We skip
>>> groups whose cpu filter doesn't match. So when we get unlucky, we can
>>> walk N * (NR_CPUS - 1) groups pointlessly for each hrtimer invocation.
>>>
>>> This has been observed to result in significant overhead when running
>>> the STREAM benchmark on 272 core Xeon Phi systems.
>>>
>>> One way to avoid this is to place our events into an rb tree sorted by
>>> CPU filter, so that our hrtimer can skip to the current CPU's
>>> list and ignore everything else.
>
> It looks like these 4 paragraphs are repeated in every patch.
>
>>> This patch implements complete replacement of lists by rb trees for
>>> pinned and flexible groups.
>
> And this is the actually informative part.
>
>>> The patch set was tested on Xeon Phi using perf_fuzzer and tests
>>> from here: https://github.com/deater/perf_event_tests
>
> Although this is also useful.
>
>>> The full patch set (v1-4) is attached for convenience.
>>>
>>> Branch revision:
>>> * perf/core 007b811b4041989ec2dc91b9614aa2c41332723e
>>> Merge tag 'perf-core-for-mingo-4.13-20170719' of
>>> git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core
>
> Not sure what this is, though.
Just mentioned branch and revision when forking local development branch.
Considered it helpful for applying the whole patch set.
>
> As has been recently pointed out elsewhere, you can get a good idea of
> how to structure and format commit messages for a particular piece of
> code by looking at 'git log path/to/code' and paying attention to common
> patterns.
>
> Thanks,
> --
> Alex
>
Thanks,
Alexey