Re: [PATCH 1/3] perf pmu: Limit PMU cpumask to online CPUs

From: Yicong Yang
Date: Tue Jun 04 2024 - 07:13:04 EST


On 2024/6/4 0:52, Ian Rogers wrote:
> On Mon, Jun 3, 2024 at 2:33 AM Yicong Yang <yangyicong@xxxxxxxxxx> wrote:
>>
>> From: Yicong Yang <yangyicong@xxxxxxxxxxxxx>
>>
>> We'll initialize the PMU's cpumask from "cpumask" or "cpus" sysfs
>> attributes if provided by the driver without checking the CPUs
>> are online or not. In such case that CPUs provided by the driver
>> contains the offline CPUs, we'll try to open event on the offline
>> CPUs and then rejected by the kernel:
>>
>> [root@localhost yang]# echo 0 > /sys/devices/system/cpu/cpu0/online
>> [root@localhost yang]# ./perf_static stat -e armv8_pmuv3_0/cycles/ --timeout 100
>> Error:
>> The sys_perf_event_open() syscall returned with 19 (No such device) for event (cpu-clock).
>> /bin/dmesg | grep -i perf may provide additional information.
>>
>> So it's better to do a double check in the userspace and only include
>> the online CPUs from "cpumask" or "cpus" to avoid opening events on
>> offline CPUs.
>
> I see where you are coming from with this but I think it is wrong. The
> cpus for an uncore PMU are a hint of the CPU to open on rather than
> the set of valid CPUs. For example:
> ```
> $ cat /sys/devices/uncore_imc_free_running_0/cpumask
> 0
> $ perf stat -vv -e uncore_imc_free_running_0/data_read/ -C 1 -a sleep 0.1
> Using CPUID GenuineIntel-6-8D-1
> Attempt to add: uncore_imc_free_running_0/data_read/
> ..after resolving event: uncore_imc_free_running_0/event=0xff,umask=0x20/
> Control descriptor is not initialized
> ------------------------------------------------------------
> perf_event_attr:
> type 24 (uncore_imc_free_running_0)
> size 136
> config 0x20ff (data_read)
> sample_type IDENTIFIER
> read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
> disabled 1
> inherit 1
> exclude_guest 1
> ------------------------------------------------------------
> sys_perf_event_open: pid -1 cpu 1 group_fd -1 flags 0x8
> sys_perf_event_open failed, error -22
> switching off cloexec flag
> ------------------------------------------------------------
> perf_event_attr:
> type 24 (uncore_imc_free_running_0)
> size 136
> config 0x20ff (data_read)
> sample_type IDENTIFIER
> read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
> disabled 1
> inherit 1
> exclude_guest 1
> ------------------------------------------------------------
> sys_perf_event_open: pid -1 cpu 1 group_fd -1 flags 0
> sys_perf_event_open failed, error -22
> switching off exclude_guest, exclude_host
> ------------------------------------------------------------
> perf_event_attr:
> type 24 (uncore_imc_free_running_0)
> size 136
> config 0x20ff (data_read)
> sample_type IDENTIFIER
> read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
> disabled 1
> inherit 1
> ------------------------------------------------------------
> sys_perf_event_open: pid -1 cpu 1 group_fd -1 flags 0 = 3
> uncore_imc_free_running_0/data_read/: 1: 4005984 102338957 102338957
> uncore_imc_free_running_0/data_read/: 4005984 102338957 102338957
>
> Performance counter stats for 'system wide':
>
> 244.51 MiB uncore_imc_free_running_0/data_read/
>
> 0.102320376 seconds time elapsed
> ```
> So the CPU mask of the PMU says to open on CPU 0, but on the command
> line when I passed "-C 1" it opened it on CPU 1. If the cpumask file
> contained an offline CPU then this change would make it so the CPU map
> in the tool were empty, however, a different CPU may be programmable
> and online.
>
> Fwiw, the tool will determine whether the mask is for all valid or a
> hint by using the notion of a PMU being "core" or not. That notion
> considers whether the mask was loading from a "cpumask" or "cpus"
> file:
> https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/util/pmu.c?h=perf-tools-next#n810
>

I see, you're correct on this. Maybe I didn't mention it clearly in the commit,
this patch doesn't intend to change the case that user specify the CPUs explicitly.
It'll have difference in case user doesn't specify the CPU list while the PMU's
'cpus' or 'cpumask' contains offline CPUs: with this patch we'll use the online
CPUs in the PMU's 'cpus' or 'cpumask' but before this we may use the offline CPUs.
We still honor the user's input by the handling in __perf_evlist__propagate_maps().

Thanks.