[tip: perf/core] perf: Fix event leak upon exit

From: tip-bot2 for Frederic Weisbecker
Date: Tue Jul 09 2024 - 07:43:38 EST


The following commit has been merged into the perf/core branch of tip:

Commit-ID: 2fd5ad3f310de22836cdacae919dd99d758a1f1b
Gitweb: https://git.kernel.org/tip/2fd5ad3f310de22836cdacae919dd99d758a1f1b
Author: Frederic Weisbecker <frederic@xxxxxxxxxx>
AuthorDate: Fri, 21 Jun 2024 11:16:00 +02:00
Committer: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
CommitterDate: Tue, 09 Jul 2024 13:26:33 +02:00

perf: Fix event leak upon exit

When a task is scheduled out, pending sigtrap deliveries are deferred
to the target task upon resume to userspace via task_work.

However, failures while adding an event's callback to the task_work
engine are ignored. And since the last call for events exit happens
after task work is eventually closed, there is a small window during
which a pending sigtrap can be queued though ignored, leaking the event
refcount addition, such as in the following scenario:

    TASK A
    -----

    do_exit()
       exit_task_work(tsk);

       <IRQ>
       perf_event_overflow()
          event->pending_sigtrap = pending_id;
          irq_work_queue(&event->pending_irq);
       </IRQ>
    =========> PREEMPTION: TASK A -> TASK B
       event_sched_out()
          event->pending_sigtrap = 0;
          atomic_long_inc_not_zero(&event->refcount)
          // FAILS: task work has exited
          task_work_add(&event->pending_task)
       [...]
       <IRQ WORK>
       perf_pending_irq()
          // early return: event->oncpu = -1
       </IRQ WORK>
       [...]
    =========> TASK B -> TASK A
       perf_event_exit_task(tsk)
          perf_event_exit_event()
             free_event()
                WARN(atomic_long_cmpxchg(&event->refcount, 1, 0) != 1)
                // leak event due to unexpected refcount == 2

As a result the event is never released while the task exits.

Fix this with appropriate error handling for task_work_add().
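
The ordering change can be illustrated outside the kernel with a small
userspace model. This is only a sketch under stated assumptions: the
toy_* types and helpers are hypothetical stand-ins, not kernel APIs, the
state != PERF_EVENT_STATE_OFF check is left out, and task work is assumed
to be already closed when the pending sigtrap is seen at sched-out.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Toy stand-ins for the kernel objects; all names here are hypothetical. */
struct toy_event {
	long refcount;        /* models event->refcount */
	int  pending_sigtrap; /* models event->pending_sigtrap */
	int  pending_work;    /* models event->pending_work */
	long nr_pending;      /* models event->ctx->nr_pending */
};

struct toy_task {
	bool task_work_exited; /* true once exit_task_work() has run */
	int  queued;           /* number of callbacks successfully queued */
};

/* Models task_work_add(): fails with -ESRCH once task work is closed. */
static int toy_task_work_add(struct toy_task *t)
{
	if (t->task_work_exited)
		return -ESRCH;
	t->queued++;
	return 0;
}

/* Pre-fix ordering: take the reference first, ignore the queueing result. */
static void sched_out_buggy(struct toy_event *ev, struct toy_task *t)
{
	if (!ev->pending_sigtrap)
		return;
	ev->pending_sigtrap = 0;
	if (!ev->pending_work) {
		ev->pending_work = 1;
		assert(ev->refcount > 0);
		ev->refcount++;             /* never dropped if queueing fails */
		(void)toy_task_work_add(t); /* error ignored */
	} else {
		ev->nr_pending--;
	}
}

/* Post-fix ordering: queue first, take the reference only on success. */
static void sched_out_fixed(struct toy_event *ev, struct toy_task *t)
{
	if (!ev->pending_sigtrap)
		return;
	ev->pending_sigtrap = 0;
	if (!ev->pending_work && !toy_task_work_add(t)) {
		assert(ev->refcount > 0);
		ev->refcount++;
		ev->pending_work = 1;
	} else {
		ev->nr_pending--;
	}
}

/* Replays the scenario: task work is closed while a sigtrap is pending. */
static long refcount_after_sched_out(bool fixed)
{
	struct toy_event ev = { .refcount = 1, .pending_sigtrap = 1, .nr_pending = 1 };
	struct toy_task t = { .task_work_exited = true };

	if (fixed)
		sched_out_fixed(&ev, &t);
	else
		sched_out_buggy(&ev, &t);
	return ev.refcount;
}
```

In this model, refcount_after_sched_out(false) leaves the count at 2,
matching the "unexpected refcount == 2" in the scenario, while
refcount_after_sched_out(true) leaves it at 1, which is the value the
cmpxchg in free_event() expects.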

Fixes: 517e6a301f34 ("perf: Fix perf_pending_task() UaF")
Signed-off-by: Frederic Weisbecker <frederic@xxxxxxxxxx>
Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
Cc: stable@xxxxxxxxxxxxxxx
Link: https://lore.kernel.org/r/20240621091601.18227-4-frederic@xxxxxxxxxx
---
kernel/events/core.c | 13 +++++--------
1 file changed, 5 insertions(+), 8 deletions(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index 51ce436..576400d 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -2284,18 +2284,15 @@ event_sched_out(struct perf_event *event, struct perf_event_context *ctx)
 	}

 	if (event->pending_sigtrap) {
-		bool dec = true;
-
 		event->pending_sigtrap = 0;
 		if (state != PERF_EVENT_STATE_OFF &&
-		    !event->pending_work) {
-			event->pending_work = 1;
-			dec = false;
+		    !event->pending_work &&
+		    !task_work_add(current, &event->pending_task, TWA_RESUME)) {
 			WARN_ON_ONCE(!atomic_long_inc_not_zero(&event->refcount));
-			task_work_add(current, &event->pending_task, TWA_RESUME);
-		}
-		if (dec)
+			event->pending_work = 1;
+		} else {
 			local_dec(&event->ctx->nr_pending);
+		}
 	}

 	perf_event_set_state(event, state);