Re: [PATCH v2 0/3] perf-stat: share hardware PMCs with BPF

From: Song Liu
Date: Fri Mar 19 2021 - 12:15:23 EST

> On Mar 19, 2021, at 8:58 AM, Namhyung Kim <namhyung@xxxxxxxxxx> wrote:
>
> Hi Arnaldo,
>
> On Sat, Mar 20, 2021 at 12:35 AM Arnaldo Carvalho de Melo
> <acme@xxxxxxxxxx> wrote:
>>
>> Em Fri, Mar 19, 2021 at 09:54:59AM +0900, Namhyung Kim escreveu:
>>> On Fri, Mar 19, 2021 at 9:22 AM Song Liu <songliubraving@xxxxxx> wrote:
>>>>> On Mar 18, 2021, at 5:09 PM, Arnaldo <arnaldo.melo@xxxxxxxxx> wrote:
>>>>> On March 18, 2021 6:14:34 PM GMT-03:00, Jiri Olsa <jolsa@xxxxxxxxxx> wrote:
>>>>>> On Thu, Mar 18, 2021 at 03:52:51AM +0000, Song Liu wrote:
>>>>>>> perf stat -C 1,3,5                  107.063 [sec]
>>>>>>> perf stat -C 1,3,5 --bpf-counters   106.406 [sec]
>>
>>>>>> I can't see why it's actually faster than normal perf ;-)
>>>>>> It would be worth finding out.
>>
>>>>> Isn't this all about contended cases?
>>
>>>> Yeah, the normal perf is doing time multiplexing; while --bpf-counters
>>>> doesn't need it.
>>
>>> Yep, so for uncontended cases, normal perf should be the same as the
>>> baseline (faster than bperf). But for contended cases, bperf
>>> works faster.
>>
>> The difference should be small enough that for people that use this in a
>> machine where contention happens most of the time, setting a
>> ~/.perfconfig to use it by default should be advantageous, i.e. no need
>> to use --bpf-counters on the command line all the time.
>>
>> So, Namhyung, can I take that as an Acked-by or a Reviewed-by? I'll take
>> a look again now but I want to have this merged on perf/core so that I
>> can work on a new BPF SKEL to use this:
>
> I have a concern about the per-CPU target, but it can be addressed later, so
>
> Acked-by: Namhyung Kim <namhyung@xxxxxxxxxx>
>
>>
>> https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/log/?h=tmp.bpf/bpf_perf_enable
>
> Interesting! Actually, I was thinking about something similar too. :)

Hi Namhyung, Jiri, and Arnaldo,

Thanks a lot for your kind review.

Here is the updated 3/3, where we use perf-bench instead of stressapptest.

Thanks,
Song