Re: [PATCH v3 RESEND] perf/core: Fix missing read event generation on task exit
From: Peter Zijlstra
Date: Tue Dec 09 2025 - 04:39:08 EST
On Tue, Dec 09, 2025 at 12:16:00PM +0800, Thaumy Cheng wrote:
> For events with inherit_stat enabled, a "read" event will be generated
> to collect per task event counts on task exit.
>
> The call chain is as follows:
>
> do_exit
>   -> perf_event_exit_task
>     -> perf_event_exit_task_context
>       -> perf_event_exit_event
>         -> perf_remove_from_context
>           -> perf_child_detach
>             -> sync_child_event
>               -> perf_event_read_event
>
> However, the child event context detaches the task too early in
> perf_event_exit_task_context, which causes sync_child_event to never
> generate the read event in this case, since child_event->ctx->task is
> always set to TASK_TOMBSTONE. Fix that by moving context lock section
> backward to ensure ctx->task is not set to TASK_TOMBSTONE before
> generating the read event.
>
> Because perf_event_free_task calls perf_event_exit_task_context with
> exit = false to tear down all child events from the context, and the
> task never lived, accessing the task PID can lead to a use-after-free.
>
> To fix that, let sync_child_event read task from argument and move the
> call to the only place it should be triggered to avoid the effect of
> setting ctx->task to TASK_TOMBSTONE, and add a task parameter to
> perf_event_exit_event to trigger the sync_child_event properly when
> needed.
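
The ordering problem and the fix described above can be sketched in a
small userspace model. This is not kernel code: every model_* name is
hypothetical and made up for illustration, and the sketch assumes the
fixed path hands in a NULL task when the context is torn down for a
task that never ran (the exit = false case).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-ins only; these are NOT the kernel's definitions. */
struct model_task { int pid; };

/* Sentinel compared by address, in the spirit of TASK_TOMBSTONE. */
struct model_task model_tombstone;
#define MODEL_TOMBSTONE (&model_tombstone)

struct model_ctx { struct model_task *task; };

struct model_event {
	struct model_ctx *ctx;
	bool inherit_stat;
	int read_events;	/* number of "read" records emitted */
	int last_pid;		/* pid stored in the last record */
};

/* Old shape: the task is looked up through ctx->task. */
void model_sync_child_old(struct model_event *ev)
{
	struct model_task *task;

	assert(ev && ev->ctx);
	task = ev->ctx->task;
	if (ev->inherit_stat && task && task != MODEL_TOMBSTONE) {
		ev->read_events++;
		ev->last_pid = task->pid;
	}
}

/* New shape: the caller passes the exiting task explicitly. */
void model_sync_child_new(struct model_event *ev, struct model_task *task)
{
	assert(ev);
	if (ev->inherit_stat && task) {
		ev->read_events++;
		ev->last_pid = task->pid;
	}
}

/* Old ordering: ctx is detached first, so the sync never fires. */
int model_exit_old(struct model_event *ev)
{
	ev->ctx->task = MODEL_TOMBSTONE;
	model_sync_child_old(ev);
	return ev->read_events;
}

/* New ordering: the tombstone no longer matters, task comes from the
 * argument; NULL models the "task never lived" teardown (assumption). */
int model_exit_new(struct model_event *ev, struct model_task *task)
{
	ev->ctx->task = MODEL_TOMBSTONE;
	model_sync_child_new(ev, task);
	return ev->read_events;
}
```

In this model, model_exit_old() never emits a record because it only
ever sees the tombstone, while model_exit_new() emits one when given a
real task and none when given NULL, without touching a task pointer
that may already be gone.
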
>
> This bug can be reproduced by running "perf record -s" and attaching to
> any program that generates perf events in its child tasks. If we check
> the result with "perf report -T", the last line of the report will leave
> an empty table like "# PID TID", which is expected to contain the
> per-task event counts by design.
>
> Fixes: ef54c1a476ae ("perf: Rework perf_event_exit_event()")
> Signed-off-by: Thaumy Cheng <thaumy.love@xxxxxxxxx>
> ---
> kernel/events/core.c | 23 ++++++++++++++---------
> 1 file changed, 14 insertions(+), 9 deletions(-)
>
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index 177e57c1a362..618e7947c358 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -2316,7 +2316,8 @@ static void perf_group_detach(struct perf_event *event)
>  	perf_event__header_size(leader);
>  }
>
> -static void sync_child_event(struct perf_event *child_event);
> +static void sync_child_event(struct perf_event *child_event,
> +			     struct task_struct *task);

This forward declaration can be entirely removed now.

Other than that, yes this seems fine. I see Ingo already picked up the
patch, thanks!