Re: [PATCH 2/3] perf mem: Fix missed p-core mem events on ADL and RPL
From: Liang, Kan
Date: Fri Sep 06 2024 - 12:09:17 EST
On 2024-09-06 10:17 a.m., Arnaldo Carvalho de Melo wrote:
> On Thu, Sep 05, 2024 at 03:47:03PM -0400, Liang, Kan wrote:
>> On 2024-09-05 3:33 p.m., Arnaldo Carvalho de Melo wrote:
>>> On Thu, Sep 05, 2024 at 10:07:36AM -0700, kan.liang@xxxxxxxxxxxxxxx wrote:
>>>> From: Kan Liang <kan.liang@xxxxxxxxxxxxxxx>
>>>>
>>>> The p-core mem events are missed when launching perf mem record on ADL
>>>> and RPL.
>>>>
>>>> root@number:~# perf mem record sleep 1
>>>> Memory events are enabled on a subset of CPUs: 16-27
>>>> [ perf record: Woken up 1 times to write data ]
>>>> [ perf record: Captured and wrote 0.032 MB perf.data ]
>>>> root@number:~# perf evlist
>>>> cpu_atom/mem-loads,ldlat=30/P
>>>> cpu_atom/mem-stores/P
>>>> dummy:u
>>>>
>>>> The 'record' variable in struct perf_mem_event indicates whether
>>>> a mem event in a mem_events[] should be recorded. The current code only
>>>> configures the variable for the first eligible PMU. That is good enough
>>>> for a non-hybrid machine or a hybrid machine which has the same
>>>> mem_events[]. However, if a different mem_events[] is used for different
>>>> PMUs on a hybrid machine, e.g., ADL or RPL, the 'record' for the second
>>>> PMU never gets a chance to be set. The mem_events[] of the second PMU
>>>> are always ignored.
>>>>
>>>> Perf mem doesn't support the per-PMU configuration now. A
>>>> per-PMU mem_events[] 'record' variable doesn't make sense. Make it
>>>> global. That could also avoid searching for the per-PMU mem_events[]
>>>> via perf_pmu__mem_events_ptr every time.
>>>>
>>>> Fixes: abbdd79b786e ("perf mem: Clean up perf_mem_events__name()")
>>>> Reported-by: Arnaldo Carvalho de Melo <acme@xxxxxxxxxx>
>>>> Closes: https://lore.kernel.org/lkml/Zthu81fA3kLC2CS2@x1/
>>>> Signed-off-by: Kan Liang <kan.liang@xxxxxxxxxxxxxxx>
>>>
>>> Looks better:
>>>
>>> root@number:~# perf report --header-only | grep 'cmdline\|event'
>>> # cmdline : /home/acme/bin/perf mem record ls
>>> # event : name = cpu_atom/mem-loads,ldlat=30/P, , id = { 500, 501, 502, 503, 504, 505, 506, 507, 508, 509, 510, 511 }, type = 10 (cpu_atom), size = 136, config = 0x5d0 (mem-loads), { sample_period, sample_freq } = 4000, sample_type = IP|TID|TIME|ADDR|PERIOD|IDENTIFIER|DATA_SRC|WEIGHT_STRUCT, read_format = ID|LOST, disabled = 1, inherit = 1, freq = 1, enable_on_exec = 1, precise_ip = 3, sample_id_all = 1, { bp_addr, config1 } = 0x1f
>>> # event : name = cpu_atom/mem-stores/P, , id = { 512, 513, 514, 515, 516, 517, 518, 519, 520, 521, 522, 523 }, type = 10 (cpu_atom), size = 136, config = 0x6d0 (mem-stores), { sample_period, sample_freq } = 4000, sample_type = IP|TID|TIME|ADDR|PERIOD|IDENTIFIER|DATA_SRC|WEIGHT_STRUCT, read_format = ID|LOST, disabled = 1, inherit = 1, freq = 1, enable_on_exec = 1, precise_ip = 3, sample_id_all = 1
>>> # event : name = cpu_core/mem-loads-aux/, , id = { 524, 525, 526, 527, 528, 529, 530, 531, 532, 533, 534, 535, 536, 537, 538, 539 }, type = 4 (cpu_core), size = 136, config = 0x8203 (mem-loads-aux), { sample_period, sample_freq } = 4000, sample_type = IP|TID|TIME|ADDR|PERIOD|IDENTIFIER|DATA_SRC|WEIGHT_STRUCT, read_format = ID|LOST, disabled = 1, inherit = 1, freq = 1, enable_on_exec = 1, precise_ip = 3, sample_id_all = 1, exclude_guest = 1
>>> # event : name = cpu_core/mem-loads,ldlat=30/, , id = { 541, 542, 543, 544, 545, 546, 547, 548, 549, 550, 551, 552, 553, 554, 555, 556 }, type = 4 (cpu_core), size = 136, config = 0x1cd (mem-loads), { sample_period, sample_freq } = 4000, sample_type = IP|TID|TIME|ADDR|PERIOD|IDENTIFIER|DATA_SRC|WEIGHT_STRUCT, read_format = ID|LOST, inherit = 1, freq = 1, precise_ip = 2, sample_id_all = 1, exclude_guest = 1, { bp_addr, config1 } = 0x1f
>>> # event : name = cpu_core/mem-stores/P, , id = { 557, 558, 559, 560, 561, 562, 563, 564, 565, 566, 567, 568, 569, 570, 571, 572 }, type = 4 (cpu_core), size = 136, config = 0x2cd (mem-stores), { sample_period, sample_freq } = 4000, sample_type = IP|TID|TIME|ADDR|PERIOD|IDENTIFIER|DATA_SRC|WEIGHT_STRUCT, read_format = ID|LOST, disabled = 1, inherit = 1, freq = 1, enable_on_exec = 1, precise_ip = 3, sample_id_all = 1
>>> # event : name = dummy:u, , id = { 573, 574, 575, 576, 577, 578, 579, 580, 581, 582, 583, 584, 585, 586, 587, 588, 589, 590, 591, 592, 593, 594, 595, 596, 597, 598, 599, 600 }, type = 1 (software), size = 136, config = 0x9 (PERF_COUNT_SW_DUMMY), { sample_period, sample_freq } = 1, sample_type = IP|TID|TIME|ADDR|IDENTIFIER|DATA_SRC|WEIGHT_STRUCT, read_format = ID|LOST, inherit = 1, exclude_kernel = 1, exclude_hv = 1, mmap = 1, comm = 1, task = 1, mmap_data = 1, sample_id_all = 1, exclude_guest = 1, mmap2 = 1, comm_exec = 1, ksymbol = 1, bpf_event = 1
>>> # intel_pt pmu capabilities: topa_multiple_entries=1, psb_cyc=1, single_range_output=1, mtc_periods=249, ip_filtering=1, output_subsys=0, cr3_filtering=1, psb_periods=3f, event_trace=0, cycle_thresholds=3f, power_event_trace=0, mtc=1, payloads_lip=0, ptwrite=1, num_address_ranges=2, max_subleaf=1, topa_output=1, tnt_disable=0
>>> root@number:~# perf evlist
>>> cpu_atom/mem-loads,ldlat=30/P
>>> cpu_atom/mem-stores/P
>>> cpu_core/mem-loads-aux/
>>> cpu_core/mem-loads,ldlat=30/
>>> cpu_core/mem-stores/P
>>> dummy:u
>>> root@number:~#
>>>
>>> But can we reconstruct the events relationship (group, :S, etc) from
>>> what we have in the perf.data header?
>>>
>>
>> Do you mean show the group relation in the perf evlist?
>>
>> $perf mem record sleep 1
>> [ perf record: Woken up 1 times to write data ]
>> [ perf record: Captured and wrote 0.027 MB perf.data (10 samples) ]
>>
>> $perf evlist -g
>> cpu_atom/mem-loads,ldlat=30/P
>> cpu_atom/mem-stores/P
>> {cpu_core/mem-loads-aux/,cpu_core/mem-loads,ldlat=30/}
>> cpu_core/mem-stores/P
>> dummy:u
>>
>> The -g option already does that, although the group modifier appears to be lost.
>
> Right, I can reproduce that, but I wonder if we shouldn't make this '-g'
> option the default?

I think evlist means a list of events, so outputting only the events
makes sense to me. With -g, the extra relationship information is provided.

>
> -----
>
> Committer testing:
>
> root@number:~# perf evlist -g
> cpu_atom/mem-loads,ldlat=30/P
> cpu_atom/mem-stores/P
> {cpu_core/mem-loads-aux/,cpu_core/mem-loads,ldlat=30/}
> cpu_core/mem-stores/P
> dummy:u
> root@number:~#
>
> The :S for '{cpu_core/mem-loads-aux/,cpu_core/mem-loads,ldlat=30/}' is
> not being added by 'perf evlist -g', to be checked.
>
> -----

It should be a generic issue, not just for perf evlist -g.
The same issue can be observed with perf report:

$perf report --header-only | grep 'cmdline\|group'
# cmdline : /home/kan/tmp/perf-tools-next/tools/perf/perf record -e {cycles,instructions}:u sleep 1
# group: {cycles,instructions}

I think it's because the per-group modifiers are converted to per-event
modifiers and stored in the evsel when parsing the group. It's hard to
reconstruct the accurate group strings relying only on the evsel, unless
we record the group string somewhere, e.g., in the leader evsel, when parsing it.
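
To illustrate the point, here is a minimal standalone sketch in C. It is
not the perf code: struct demo_evsel, its fields, and the demo_*() helpers
are all hypothetical, and only the 'u' modifier is modelled. It shows a
group modifier being flattened onto each member, and the group string
surviving only if it is saved on the leader at parse time.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for an evsel (not the real struct). */
struct demo_evsel {
	const char *name;
	bool exclude_kernel;	/* assumed effect of a 'u' modifier */
	const char *group_str;	/* leader only: saved group string, or NULL */
};

/* Append s to buf without writing past len bytes in total. */
static void demo_append(char *buf, size_t len, const char *s)
{
	size_t used = strlen(buf);

	if (used + 1 < len)
		strncat(buf, s, len - used - 1);
}

/*
 * Flatten a per-group modifier onto every member. After this, nothing
 * records that the modifier was written once on the group.
 */
static void demo_apply_group_modifier(struct demo_evsel *evsels, int nr, char mod)
{
	assert(nr > 0);
	for (int i = 0; i < nr; i++) {
		if (mod == 'u')
			evsels[i].exclude_kernel = true;
	}
}

/*
 * Build the group string. evsels[0] is treated as the leader. If the
 * leader saved the original string, use it; otherwise rebuild "{a,b}"
 * from the member names, which cannot include the group modifier.
 */
static void demo_group_string(const struct demo_evsel *evsels, int nr,
			      char *buf, size_t len)
{
	assert(nr > 0 && len > 0);
	buf[0] = '\0';

	if (evsels[0].group_str) {
		demo_append(buf, len, evsels[0].group_str);
		return;
	}

	demo_append(buf, len, "{");
	for (int i = 0; i < nr; i++) {
		if (i)
			demo_append(buf, len, ",");
		demo_append(buf, len, evsels[i].name);
	}
	demo_append(buf, len, "}");
}
```

With only the flattened per-event flags, demo_group_string() can produce
the member list but not the trailing modifier, which matches the
"# group: {cycles,instructions}" output above. Setting group_str on the
leader is one possible way to keep the original string; whether that is
the right place in the real code is still to be decided.
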
Thanks,
Kan