Re: [tip: perf/core] perf: Generic hotplug support for a PMU with a scope

From: Steven Price
Date: Thu Sep 12 2024 - 06:13:01 EST


On 10/09/2024 10:59, tip-bot2 for Kan Liang wrote:
> The following commit has been merged into the perf/core branch of tip:
>
> Commit-ID: 4ba4f1afb6a9fed8ef896c2363076e36572f71da
> Gitweb: https://git.kernel.org/tip/4ba4f1afb6a9fed8ef896c2363076e36572f71da
> Author: Kan Liang <kan.liang@xxxxxxxxxxxxxxx>
> AuthorDate: Fri, 02 Aug 2024 08:16:37 -07:00
> Committer: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
> CommitterDate: Tue, 10 Sep 2024 11:44:12 +02:00
>
> perf: Generic hotplug support for a PMU with a scope
>
> The perf subsystem assumes that the counters of a PMU are per-CPU. So
> the user space tool reads a counter from each CPU in the system wide
> mode. However, many PMUs don't have a per-CPU counter. The counter is
> effective for a scope, e.g., a die or a socket. To address this, a
> cpumask is exposed by the kernel driver to restrict to one CPU to stand
> for a specific scope. In case the given CPU is removed,
> the hotplug support has to be implemented for each such driver.
>
> The codes to support the cpumask and hotplug are very similar.
> - Expose a cpumask into sysfs
> - Pickup another CPU in the same scope if the given CPU is removed.
> - Invoke the perf_pmu_migrate_context() to migrate to a new CPU.
> - In event init, always set the CPU in the cpumask to event->cpu
>
> Similar duplicated codes are implemented for each such PMU driver. It
> would be good to introduce a generic infrastructure to avoid such
> duplication.
>
> 5 popular scopes are implemented here, core, die, cluster, pkg, and
> the system-wide. The scope can be set when a PMU is registered. If so, a
> "cpumask" is automatically exposed for the PMU.
>
> The "cpumask" is from the perf_online_<scope>_mask, which is to track
> the active CPU for each scope. They are set when the first CPU of the
> scope is online via the generic perf hotplug support. When a
> corresponding CPU is removed, the perf_online_<scope>_mask is updated
> accordingly and the PMU will be moved to a new CPU from the same scope
> if possible.
>
> Signed-off-by: Kan Liang <kan.liang@xxxxxxxxxxxxxxx>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
> Link: https://lore.kernel.org/r/20240802151643.1691631-2-kan.liang@xxxxxxxxxxxxxxx
> ---
> include/linux/perf_event.h | 18 ++++-
> kernel/events/core.c | 164 +++++++++++++++++++++++++++++++++++-
> 2 files changed, 180 insertions(+), 2 deletions(-)
>
[...]
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index 67e115d..5ff9735 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
[...]
> @@ -13856,6 +13980,42 @@ static void perf_event_exit_cpu_context(int cpu) { }
>
> #endif
>
> +static void perf_event_setup_cpumask(unsigned int cpu)
> +{
> + struct cpumask *pmu_cpumask;
> + unsigned int scope;
> +
> + cpumask_set_cpu(cpu, perf_online_mask);
> +
> + /*
> + * Early boot stage, the cpumask hasn't been set yet.
> + * The perf_online_<domain>_masks includes the first CPU of each domain.
> + * Always uncondifionally set the boot CPU for the perf_online_<domain>_masks.
             ^^^^^^^^^^^^^^^ typo: "uncondifionally" should be "unconditionally"

> + */
> + if (!topology_sibling_cpumask(cpu)) {

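This causes a compiler warning, which breaks the build when warnings are
treated as errors. Here topology_sibling_cpumask() resolves to the address
of 'thread_sibling' (include/linux/arch_topology.h), which is never NULL,
so the branch can never be taken: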

> kernel/events/core.c: In function 'perf_event_setup_cpumask':
> kernel/events/core.c:14012:13: error: the comparison will always evaluate as 'true' for the address of 'thread_sibling' will never be NULL [-Werror=address]
> 14012 | if (!topology_sibling_cpumask(cpu)) {
> | ^
> In file included from ./include/linux/topology.h:30,
> from ./include/linux/gfp.h:8,
> from ./include/linux/xarray.h:16,
> from ./include/linux/list_lru.h:14,
> from ./include/linux/fs.h:13,
> from kernel/events/core.c:11:
> ./include/linux/arch_topology.h:78:19: note: 'thread_sibling' declared here
> 78 | cpumask_t thread_sibling;
> | ^~~~~~~~~~~~~~
> cc1: all warnings being treated as errors
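
To illustrate what the compiler is objecting to, here is a minimal userspace
sketch. Everything in it is a hypothetical stand-in (the *_sketch types,
the SIBLING_MASK_SKETCH() macro and the helpers are not the kernel
definitions); it only mirrors the shape suggested by the note above, where
the accessor yields the address of an embedded member. It also shows one
possible alternative, testing whether a mask is empty instead of testing the
pointer. This is not a proposed patch, just a way to see the difference:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for illustration only; not the kernel definitions. */
#define NR_CPUS_SKETCH 4

typedef struct {
	unsigned long bits;	/* one bit per CPU, enough for this sketch */
} cpumask_sketch_t;

struct cpu_topology_sketch {
	cpumask_sketch_t thread_sibling;	/* embedded member, not a pointer */
};

static struct cpu_topology_sketch cpu_topology_sketch[NR_CPUS_SKETCH];

/*
 * Assumed shape of the accessor: it takes the address of an embedded
 * member, so the result can never be NULL. Writing
 *	if (!SIBLING_MASK_SKETCH(cpu)) { ... }
 * is what draws -Waddress from GCC, and the body would be dead code.
 */
#define SIBLING_MASK_SKETCH(cpu) (&cpu_topology_sketch[(cpu)].thread_sibling)

/* True when no CPU bit is set in the mask. */
static inline bool mask_empty_sketch(const cpumask_sketch_t *mask)
{
	return mask->bits == 0;
}

/* Set the bit for one CPU in the mask. */
static inline void mask_set_cpu_sketch(unsigned int cpu, cpumask_sketch_t *mask)
{
	assert(cpu < NR_CPUS_SKETCH);
	mask->bits |= 1UL << cpu;
}

/*
 * One way to express "the masks have not been populated yet": look at
 * the contents of a mask rather than at the pointer to it.
 */
static inline bool masks_unpopulated_sketch(const cpumask_sketch_t *online)
{
	return mask_empty_sketch(online);
}
```

With a zero-initialised mask, masks_unpopulated_sketch() returns true until
the first mask_set_cpu_sketch() call, which is the kind of "early boot"
condition the comment in the patch describes. Whether an emptiness check is
the right condition for perf_event_setup_cpumask() is for the patch author
to decide.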

Steve