Re: [PATCH v3 2/3] perf: Add driver for Arm NI-700 interconnect PMU

From: Robin Murphy
Date: Wed Sep 04 2024 - 13:16:08 EST


On 2024-09-04 1:24 pm, Will Deacon wrote:
> On Mon, Sep 02, 2024 at 07:47:18PM +0100, Robin Murphy wrote:
>> On 02/09/2024 3:47 pm, Will Deacon wrote:
>>>> +static ssize_t arm_ni_format_show(struct device *dev,
>>>> +				  struct device_attribute *attr, char *buf)
>>>> +{
>>>> +	struct arm_ni_format_attr *fmt = container_of(attr, typeof(*fmt), attr);
>>>> +	int lo = __ffs(fmt->field), hi = __fls(fmt->field);
>>>> +
>>>> +	return sysfs_emit(buf, "config:%d-%d\n", lo, hi);
>>>> +}

>>> Nit: if you end up adding single-bit config fields in the future, this
>>> will quietly do the wrong thing. Maybe safe-guard the 'lo==hi' case (even
>>> if you just warn once and return without doing anything).

>> The counter-argument is that I don't foresee having any reason to add
>> single-bit config fields here in future, nor indeed config1 or config2
>> fields, so I intentionally pruned the would-be dead code while copy-pasting
>> this implementation from arm-cmn. Yes, if someone were to make an incomplete
>> change without paying attention or testing they could introduce a bug, but
>> when is that ever not true?

> I guess I'm just a little more wary when it comes to UAPI. Somebody starts
> relying on the broken message and then you're toast. It's also incredibly
> easy to avoid by construction and the dead code isn't hurting anybody.

Even presuming some theoretical unreviewed broken patch did get merged and shipped in real-world kernels without ever being fixed, I still struggle to imagine how userspace could somehow grow to *rely* on one PMU driver displaying a format attribute in an unexpected manner inconsistent with every other PMU driver, as opposed to the far more likely scenario of simply going wrong trying to parse it.

Anyway, after playing with some fun compile-time checks yesterday I've just realised there is actually an even simpler solution for doing the right thing in general, so I guess thanks for leaning on this :)

>>>> +	name = devm_kasprintf(ni->dev, GFP_KERNEL, "arm_ni_%d_cd_%d", ni->id, cd->id);
>>>> +	if (!name)
>>>> +		return -ENOMEM;
>>>> +
>>>> +	err = cpuhp_state_add_instance(arm_ni_hp_state, &cd->cpuhp_node);
>>>> +	if (err)
>>>> +		return err;

>>> What happens if there's a CPU hotplug operation here? Can we end up calling
>>> perf_pmu_migrate_context() concurrently with perf_pmu_register()?

>> Yes. Alternatively we could register the PMU before the hotplug handler,
>> then potentially miss a hotplug event and leave a user-visible PMU
>> associated with an invalid CPU. This is a known issue for all system PMU
>> drivers, and the conclusion 5 years ago was that it's impractical to close
>> this race from outside perf core itself[1][2].

> Ok, I'm going to walk right into the trap you've set me...
>
> Why can't we prevent hotplug (e.g. with cpus_read_lock()) while we're
> setting this up?
>
> ... and climbing back out of that trap, is the conversation you had with
> Thomas written down anywhere?

I think that must have been the time we talked in person in Cambridge. I can't now remember if there is (or was) anything in perf_pmu_register() that's actually incompatible with being called under cpus_read_lock(), but I do recall that the overall concept of exporting more bits of the hotplug machinery in order to copy-paste the same boilerplate bodge in every PMU driver was... unpopular.
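[For concreteness, the copy-paste boilerplate in question would look roughly like this in each driver's probe path; a non-runnable sketch against the kernel API, reusing the arm-ni names purely for illustration, and whether perf_pmu_register() is actually safe under cpus_read_lock() is exactly the open question here:]

```c
/* Hypothetical sketch only: hold the hotplug lock across both steps so a
 * CPU can't be unplugged between adding the cpuhp instance and the PMU
 * becoming user-visible. */
cpus_read_lock();
err = cpuhp_state_add_instance_cpuslocked(arm_ni_hp_state, &cd->cpuhp_node);
if (!err)
	err = perf_pmu_register(&cd->pmu, name, -1);
cpus_read_unlock();
```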

> I don't want to block this patch, but if five years has passed with
> nobody looking at this then we probably need to address that at some
> point before adding more and more broken drivers.

Looking at it again, is there actually a problem with the current state of things? Following through the call path:

perf_pmu_migrate_context()
__perf_pmu_remove()
perf_event_groups_first()
__group_cmp()
perf_event_groups_cmp()

the "pmu" pointer only seems to be used as a key to match events in the current CPU context, which obviously won't find anything at this point. AFAICS it's never even dereferenced, unless any events *are* found for __perf_pmu_install() to do something with, which necessarily implies an initialised and registered PMU for them to have been opened in the first place.

Thanks,
Robin.