Re: [tip: perf/urgent] perf/x86/intel: Fix NULL event dereference crash in handle_pmi_common()

From: Peter Zijlstra

Date: Sat Dec 13 2025 - 07:06:07 EST


On Fri, Dec 12, 2025 at 09:04:41AM -0000, tip-bot2 for Evan Li wrote:
> The following commit has been merged into the perf/urgent branch of tip:
>
> Commit-ID: 9415f749d34b926b9e4853da1462f4d941f89a0d
> Gitweb: https://git.kernel.org/tip/9415f749d34b926b9e4853da1462f4d941f89a0d
> Author: Evan Li <evan.li@xxxxxxxxxxxxxxxxx>
> AuthorDate: Fri, 12 Dec 2025 16:49:43 +08:00
> Committer: Ingo Molnar <mingo@xxxxxxxxxx>
> CommitterDate: Fri, 12 Dec 2025 09:57:39 +01:00
>
> perf/x86/intel: Fix NULL event dereference crash in handle_pmi_common()
>
> handle_pmi_common() may observe an active bit set in cpuc->active_mask
> while the corresponding cpuc->events[] entry has already been cleared,
> which leads to a NULL pointer dereference.
>
> This can happen when interrupt throttling stops all events in a group
> while PEBS processing is still in progress. perf_event_overflow() can
> trigger perf_event_throttle_group(), which stops the group and clears
> the cpuc->events[] entry, but the active bit may still be set when
> handle_pmi_common() iterates over the events.
>
> The following recent fix:
>
> 7e772a93eb61 ("perf/x86: Fix NULL event access and potential PEBS record loss")
>
> moved the cpuc->events[] clearing from x86_pmu_stop() to x86_pmu_del() and
> relied on cpuc->active_mask/pebs_enabled checks. However,
> handle_pmi_common() can still encounter a NULL cpuc->events[] entry
> despite the active bit being set.
>
> Add an explicit NULL check on the event pointer before using it,
> to cover this legitimate scenario and avoid the NULL dereference crash.
>
> Fixes: 7e772a93eb61 ("perf/x86: Fix NULL event access and potential PEBS record loss")
> Reported-by: kitta <kitta@xxxxxxxxxxxxxxxxx>
> Co-developed-by: kitta <kitta@xxxxxxxxxxxxxxxxx>
> Signed-off-by: Evan Li <evan.li@xxxxxxxxxxxxxxxxx>
> Signed-off-by: Ingo Molnar <mingo@xxxxxxxxxx>
> Link: https://patch.msgid.link/20251212084943.2124787-1-evan.li@xxxxxxxxxxxxxxxxx
> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220855
> ---
> arch/x86/events/intel/core.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
> index 853fe07..bdf3f0d 100644
> --- a/arch/x86/events/intel/core.c
> +++ b/arch/x86/events/intel/core.c
> @@ -3378,6 +3378,9 @@ static int handle_pmi_common(struct pt_regs *regs, u64 status)
>
>  		if (!test_bit(bit, cpuc->active_mask))
>  			continue;
> +		/* Event may have already been cleared: */
> +		if (!event)
> +			continue;

I still hate this commit -- it doesn't actually explain anything, at
best it papers over an issue elsewhere :-(
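
To make the quoted hunk concrete, here is a minimal user-space sketch of
the loop with the added check. It is a toy model, not the kernel code:
the toy_* names, TOY_IDX_MAX and the plain uint64_t bitmask are
assumptions made up for illustration. It only shows that the guard skips
a counter whose active bit is set while its events[] slot is NULL; it
says nothing about why that state arises in the first place.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_IDX_MAX 8	/* hypothetical; stands in for the real counter limit */

/* Hypothetical stand-in for struct perf_event. */
struct toy_event {
	uint64_t overflows;
};

/* Hypothetical stand-in for the per-CPU PMU state. */
struct toy_cpu_hw_events {
	struct toy_event *events[TOY_IDX_MAX];
	uint64_t active_mask;	/* one bit per counter */
};

/*
 * Walk the overflow status bits. Every set status bit counts as handled,
 * but only counters that are active AND still have an event attached
 * are processed.
 */
static int toy_handle_pmi(struct toy_cpu_hw_events *cpuc, uint64_t status)
{
	int handled = 0;

	assert(cpuc != NULL);

	for (int bit = 0; bit < TOY_IDX_MAX; bit++) {
		struct toy_event *event;

		if (!(status & (1ULL << bit)))
			continue;

		event = cpuc->events[bit];
		handled++;

		if (!(cpuc->active_mask & (1ULL << bit)))
			continue;
		/* The guard from the patch: slot already cleared. */
		if (!event)
			continue;

		event->overflows++;
	}

	return handled;
}
```

With bits 0 and 1 both active and only events[0] populated,
toy_handle_pmi() counts both status bits as handled but updates only the
event in slot 0; without the guard, bit 1 would dereference NULL.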