Re: [PATCH v3 7/7] perf: Make perf_pmu_unregister() useable
From: Peter Zijlstra
Date: Mon Mar 10 2025 - 12:17:03 EST
On Mon, Mar 10, 2025 at 09:05:45PM +0530, Ravi Bangoria wrote:
> > @@ -207,6 +207,7 @@ static void perf_ctx_unlock(struct perf_
> > }
> >
> > #define TASK_TOMBSTONE ((void *)-1L)
> > +#define EVENT_TOMBSTONE ((void *)-1L)
> >
> > static bool is_kernel_event(struct perf_event *event)
> > {
> > @@ -2348,6 +2349,11 @@ static void perf_child_detach(struct per
> >
> > sync_child_event(event);
> > list_del_init(&event->child_list);
> > + /*
> > + * Cannot set to NULL, as that would confuse the situation vs
> > + * not being a child event. See for example unaccount_event().
> > + */
> > + event->parent = EVENT_TOMBSTONE;
>
> This will cause issues where we do `event = event->parent`. No? For ex:
>
> perf_pmu_unregister()
> ...
> perf_event_exit_event()
> perf_event_wakeup()
> ring_buffer_wakeup()
> if (event->parent)
> event = event->parent;
It will. However, if we do not do this, we have a potential
use-after-free, and people seem to take a dim view of those too :-)
I had convinced myself all paths that were doing this 'event =
event->parent' were unreachable by the time we set the TOMBSTONE, but
clearly I missed one.
I suppose we can do something like so, not really pretty though.
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -13722,9 +13722,12 @@ perf_event_exit_event(struct perf_event
{
struct perf_event *parent_event = event->parent;
unsigned long detach_flags = DETACH_EXIT;
+ bool parent_dead = false;
- if (parent_event == EVENT_TOMBSTONE)
+ if (parent_event == EVENT_TOMBSTONE) {
parent_event = NULL;
+ parent_dead = true;
+ }
if (parent_event) {
/*
@@ -13748,6 +13751,9 @@ perf_event_exit_event(struct perf_event
perf_remove_from_context(event, detach_flags);
+ if (parent_dead)
+ return;
+
/*
* Child events can be freed.
*/