Re: [PATCH 0/3] Perf avoid opening events on offline CPUs

From: Yicong Yang
Date: Tue Jun 04 2024 - 04:04:03 EST


On 2024/6/4 0:42, Ian Rogers wrote:
> On Mon, Jun 3, 2024 at 2:33 AM Yicong Yang <yangyicong@xxxxxxxxxx> wrote:
>>
>> From: Yicong Yang <yangyicong@xxxxxxxxxxxxx>
>>
>> If user doesn't specify the CPUs, perf will try to open events on CPUs
>> of the PMU which is initialized from the PMU's "cpumask" or "cpus" sysfs
>> attributes if provided. But we doesn't check whether the CPUs provided
>> by the PMU are all online. So we may open events on offline CPUs if PMU
>> driver provide offline CPUs and then we'll be rejected by the kernel:
>>
>> [root@localhost yang]# echo 0 > /sys/devices/system/cpu/cpu0/online
>
> Generally Linux won't let you take CPU0 off line, I'm not able to
> follow this step on x86 Linux. Fwiw, I routinely run perf with the
> core hyperthread siblings offline.
>

It doesn't matter whether the offline CPU is CPU0 or some other CPU. arm64
places no restriction on taking CPU0 offline, so I just used it as an example.

I cannot reproduce it on x86. I think that may be because pmu->cpus is
initialized through a different path in pmu_cpumask(). There's no "cpus"
attribute for the core PMU on my x86 machine:
root@ubuntu204:~# ls /sys/bus/event_source/devices/cpu/
allow_tsx_force_abort caps events format freeze_on_smi perf_event_mux_interval_ms power rdpmc subsystem type uevent

So pmu_cpumask() will infer that it is a core PMU and initialize the cpus
with the online CPUs [1]. On arm64 there is a "cpus" sysfs attribute,
so pmu->cpus is initialized from "cpus" without checking whether each
CPU is online. Patch 1/3 proposes adding that check.
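
To illustrate the idea only (a rough sketch in Python, not the actual perf
code; parse_cpu_list() and pmu_cpus_online() are hypothetical names, and I'm
assuming both inputs use the cpulist format such as "0-3,8"), the CPUs
advertised by the PMU are intersected with the online CPUs:

```python
def parse_cpu_list(text):
    """Parse a cpulist string such as "0-3,8" into a set of CPU numbers."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue  # tolerate an empty list
        if "-" in part:
            lo, hi = part.split("-", 1)
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return cpus


def pmu_cpus_online(pmu_cpus_text, online_text):
    """Keep only the CPUs the PMU advertises that are also online."""
    return sorted(parse_cpu_list(pmu_cpus_text) & parse_cpu_list(online_text))


# The PMU's "cpus" says 0-3, but CPU0 has been taken offline.
print(pmu_cpus_online("0-3", "1-3"))  # prints [1, 2, 3]
```

In practice the two strings would come from the PMU's "cpus" attribute and
/sys/devices/system/cpu/online respectively.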

Reading the code [2], there is a "cpus" sysfs attribute on x86 hybrid machines,
and it seems to always reflect the online CPUs supported by that PMU.

[1] https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/util/pmu.c?h=perf-tools-next#n779
[2] https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree//arch/x86/events/intel/core.c#n5736

Thanks.