[tip: perf/core] perf/core: Fix kernel register info leak via hardware skid
From: tip-bot2 for Dapeng Mi
Date: Tue Jun 30 2026 - 05:13:55 EST
The following commit has been merged into the perf/core branch of tip:
Commit-ID: de97a56a9fbd3a01e942057426ea749c282231b0
Gitweb: https://git.kernel.org/tip/de97a56a9fbd3a01e942057426ea749c282231b0
Author: Dapeng Mi <dapeng1.mi@xxxxxxxxxxxxxxx>
AuthorDate: Fri, 12 Jun 2026 17:01:13 +08:00
Committer: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
CommitterDate: Tue, 30 Jun 2026 10:57:07 +02:00
perf/core: Fix kernel register info leak via hardware skid
An unprivileged hardware perf event using exclude_kernel=1 can leak kernel
register data to user space via PERF_SAMPLE_REGS_INTR or PERF_SAMPLE_IP.
Due to hardware skid, a PMI may trigger after the CPU has already entered
kernel space (Ring 0), bypassing the perf_allow_kernel() privilege
barrier.

This security vulnerability is severely exacerbated by upcoming support
for SIMD register sampling via XSAVES, which could expose sensitive kernel
FPU states (such as active cryptographic keys).

Fix this by ensuring that sampled register data is dropped if the event's
exclude_kernel attribute is set but the PMI catches the CPU in kernel mode.
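
The rule described above can be illustrated with a small user-space model.
This is only a sketch and not the kernel code: every model_* type and
function is a hypothetical stand-in, and the user_mode field stands in for
the result of the kernel's user_mode(regs) check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel types; not the real definitions. */
enum model_abi { MODEL_ABI_NONE = 0, MODEL_ABI_64 = 2 };

struct model_pt_regs {
	uint64_t ip;
	bool user_mode;		/* models the result of user_mode(regs) */
};

struct model_perf_regs {
	enum model_abi abi;
	const struct model_pt_regs *regs;
};

/* A sample must be scrubbed when the event excludes the kernel but the
 * interrupt (because of skid) landed while the CPU was in kernel mode. */
static inline bool model_must_drop(bool exclude_kernel,
				   const struct model_pt_regs *regs)
{
	return exclude_kernel && !regs->user_mode;
}

/* Models the instruction pointer path: report 0 instead of a kernel IP. */
static inline uint64_t model_instruction_pointer(bool exclude_kernel,
						 const struct model_pt_regs *regs)
{
	assert(regs != NULL);
	return model_must_drop(exclude_kernel, regs) ? 0 : regs->ip;
}

/* Models the interrupt register path: report "no ABI" and no registers. */
static inline void model_sample_regs_intr(struct model_perf_regs *out,
					  const struct model_pt_regs *regs,
					  bool exclude_kernel)
{
	assert(out != NULL && regs != NULL);
	if (model_must_drop(exclude_kernel, regs)) {
		out->abi = MODEL_ABI_NONE;
		out->regs = NULL;
	} else {
		out->abi = MODEL_ABI_64;	/* assumed 64-bit task in this model */
		out->regs = regs;
	}
}
```

In this model, a sample taken in user mode or with exclude_kernel clear
passes through unchanged; only the skid case is scrubbed.
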
Signed-off-by: Dapeng Mi <dapeng1.mi@xxxxxxxxxxxxxxx>
Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
Link: https://lore.kernel.org/all/20260529085613.CCAFB1F00893@xxxxxxxxxxxxxxx/
Link: https://patch.msgid.link/20260612090114.3188886-8-dapeng1.mi@xxxxxxxxxxxxxxx
---
kernel/events/core.c | 37 ++++++++++++++++++++++++++++++-------
1 file changed, 30 insertions(+), 7 deletions(-)
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 954c36e..dd3bd9c 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -7791,10 +7791,20 @@ unsigned long perf_misc_flags(struct perf_event *event,
 unsigned long perf_instruction_pointer(struct perf_event *event,
 				       struct pt_regs *regs)
 {
-	if (should_sample_guest(event))
-		return perf_guest_get_ip();
+	/*
+	 * Hardware skid can lead to a scenario where a PMI is
+	 * delivered after the CPU has already entered kernel mode.
+	 * In that case, user-space sampling must not expose kernel
+	 * register state.
+	 */
+	if (should_sample_guest(event)) {
+		return event->attr.exclude_kernel &&
+		       !(perf_guest_state() & PERF_GUEST_USER) ?
+		       0 : perf_guest_get_ip();
+	}
 
-	return perf_arch_instruction_pointer(regs);
+	return event->attr.exclude_kernel && !user_mode(regs) ?
+	       0 : perf_arch_instruction_pointer(regs);
 }
 
 static void
@@ -7828,10 +7838,22 @@ static void perf_sample_regs_user(struct perf_regs *regs_user,
 }
 
 static void perf_sample_regs_intr(struct perf_regs *regs_intr,
-				  struct pt_regs *regs)
+				  struct pt_regs *regs,
+				  bool exclude_kernel)
 {
-	regs_intr->regs = regs;
-	regs_intr->abi = perf_reg_abi(current);
+	/*
+	 * Hardware skid can lead to a scenario where a PMI is
+	 * delivered after the CPU has already entered kernel mode.
+	 * In that case, user-space sampling must not expose kernel
+	 * register state.
+	 */
+	if (exclude_kernel && !user_mode(regs)) {
+		regs_intr->abi = PERF_SAMPLE_REGS_ABI_NONE;
+		regs_intr->regs = NULL;
+	} else {
+		regs_intr->regs = regs;
+		regs_intr->abi = perf_reg_abi(current);
+	}
 }
@@ -8722,7 +8744,8 @@ void perf_prepare_sample(struct perf_sample_data *data,
 		/* regs dump ABI info */
 		int size = sizeof(u64);
 
-		perf_sample_regs_intr(&data->regs_intr, regs);
+		perf_sample_regs_intr(&data->regs_intr, regs,
+				      event->attr.exclude_kernel);
 
 		if (data->regs_intr.regs) {
 			u64 mask = event->attr.sample_regs_intr;