Re: [PATCH 1/2] perf/arm-ni: Don't crash in probing clock domains without a PMU instance

From: Robin Murphy

Date: Mon Jan 26 2026 - 11:34:54 EST


On 2026-01-26 3:30 am, Baisheng Gao wrote:
The NULL pmusela pointer implies that current clock domain doesn't have
a PMU instance. Return 0 for probing the next clock domain. Otherwise a
kernel crash will happen.

Sorry, this doesn't add up with the diff below. All of the documentation says that the PMU is in integral part of the clock domain, and I can find no mention of any configuration parameter allowing it to be omitted. It is possible for the PMU registers to be inaccessible because Non-Secure access has not been enabled, but we account for that already.

Signed-off-by: Baisheng Gao <baisheng.gao@xxxxxxxxxx>
---
drivers/perf/arm-ni.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/perf/arm-ni.c b/drivers/perf/arm-ni.c
index 66858c65215d..53b656983da1 100644
--- a/drivers/perf/arm-ni.c
+++ b/drivers/perf/arm-ni.c
@@ -526,6 +526,7 @@ static int arm_ni_init_cd(struct arm_ni *ni, struct arm_ni_node *node, u64 res_s
{
struct arm_ni_cd *cd = ni->cds + node->id;
const char *name;
+ static atomic_t id;
cd->id = node->id;
cd->num_units = node->num_components;
@@ -562,6 +563,11 @@ static int arm_ni_init_cd(struct arm_ni *ni, struct arm_ni_node *node, u64 res_s
case NI_TMNI:
case NI_CMNI:
unit->pmusela = arm_ni_get_pmusel(ni, unit_base);
+ if (!unit->pmusela) {

...However this is not about the PMU node anyway; this would represent the FCU at an interface node being missing. Again, it's possible for access to the FCU itself to be restricted, per the test below, but the subfeature ID registers should always be readable, and per the "Should be impossible" comment in arm_ni_get_pmusel(), the nodes that can generate PMU events should always include an FCU.

Could you please clarify some more details of what the exact situation is that you're trying to deal with here?

+ dev_info(ni->dev, "No have PMU %d\n", cd->id);
+ devm_kfree(ni->dev, cd->units);
+ return 0;
+ }
writel_relaxed(1, unit->pmusela);
if (readl_relaxed(unit->pmusela) != 1)
dev_info(ni->dev, "No access to node 0x%04x%04x\n", unit->id, unit->type);
@@ -591,7 +597,7 @@ static int arm_ni_init_cd(struct arm_ni *ni, struct arm_ni_node *node, u64 res_s
writel_relaxed(U32_MAX, cd->pmu_base + NI_PMCNTENCLR);
writel_relaxed(U32_MAX, cd->pmu_base + NI_PMOVSCLR);
- cd->irq = platform_get_irq(to_platform_device(ni->dev), cd->id);
+ cd->irq = platform_get_irq(to_platform_device(ni->dev), atomic_fetch_inc(&id));

This is clearly wrong. Disregarding how badly it would go with multiple NI instances, even within a single instance I don';t think there's any obvious guarantee of a stable order. The firmware bindings are already defined, and that definition is not "the order in which a particular version of the Linux driver happens to parse things".

Thanks,
Robin.

if (cd->irq < 0)
return cd->irq;