Re: [PATCH 0/5] perf: Fix pmu for drivers with bind/unbind

From: Lucas De Marchi
Date: Fri Oct 11 2024 - 19:03:35 EST


On Fri, Oct 11, 2024 at 03:21:18PM -0700, Umesh Nerlige Ramappa wrote:
On Tue, Oct 08, 2024 at 01:34:56PM -0500, Lucas De Marchi wrote:
v2 of my attempt at fixing how i915 interacts with perf events.

v1 - https://lore.kernel.org/all/20240722210648.80892-1-lucas.demarchi@xxxxxxxxx/

From other people:
1) https://lore.kernel.org/lkml/20240115170120.662220-1-tvrtko.ursulin@xxxxxxxxxxxxxxx/
2) https://lore.kernel.org/all/20240213180302.47266-1-umesh.nerlige.ramappa@xxxxxxxxx/

WARNING: patches 1, 4 and 5 are NOT intended to be applied as is. More
on this below.

This series basically builds on the idea of the first patch of my
previous series, but extends it in a way that

a) the other patches are not needed (at least, not as is) and
b) driver can rebind just fine - no issues with the new call to
perf_pmu_register()

I have 2 broad questions:

(1) I am curious how (b) works. You seem to have a notion of instances of devices and then are you using the instance number to create the name used for the sysfs entry? Did I get that right?


humn... no. We just unregister the driver from pmu, so the name becomes
free for when the driver rebinds with the same event name.


If so, should the application discover what the name is each time it is run? In the failure case that I am seeing, I am running an application that does not work when I rename the sysfs entry to something else.

(2) Similar to Patch 1 of your v1 series where you modified _free_event:

static void _free_event(struct perf_event *event)
{
struct module *module;
...
module = event->pmu->module;
...
if (event->destroy)
event->destroy(event);
...
module_put(module);
...
}

With the above code, the kref to i915->pmu can be taken from the below points in i915 code (just to point out the sequence):

i915_pmu_register()
{
perf_pmu_register()
drm_dev_get()
kref_init()
}

i915_pmu_unregister()
{
kref_put(&ref, pmu_cleanup)
}

i915_pmu_event_init()
{
kref_get()
}

i915_pmu_event_destroy()
{
kref_put(&ref, pmu_cleanup)
}

void pmu_cleanup(struct kref *ref)
{
i915_pmu_unregister_cpuhp_state(pmu);
perf_pmu_unregister(&pmu->base);
pmu->base.event_init = NULL;
kfree(pmu->base.attr_groups);
if (!is_igp(i915))
kfree(pmu->name);
free_event_attributes(pmu);
drm_dev_put(&i915->drm);
}

Would this work? Do you see any gaps that may need the ref counting code you added in perf?


well... I just posted the fixes for i915 on top of these patches :)
You may want to look at https://lore.kernel.org/intel-gfx/20241011225430.1219345-1-lucas.demarchi@xxxxxxxxx/
It works for me on my machine with a DG2.

Lucas De Marchi