[PATCH] perf/x86/rapl: fix deadlock in rapl_pmu_event_stop
From: Duoming Zhou
Date: Sat Sep 17 2022 - 10:48:18 EST
There is a deadlock in rapl_pmu_event_stop(); the sequence is shown below:
(thread 1)                        | (thread 2)
rapl_pmu_event_stop()             | rapl_hrtimer_handle()
  ...                             |   if (!pmu->n_active)
  raw_spin_lock_irqsave() // (1)  |   ...
  ...                             |
  hrtimer_cancel()                |   raw_spin_lock_irqsave() // (2)
  (blocks forever)                |
rapl_pmu_event_stop() holds pmu->lock at position (1) and calls
hrtimer_cancel() to wait for rapl_hrtimer_handle() to finish, but
rapl_hrtimer_handle() also needs pmu->lock at position (2). As a
result, rapl_pmu_event_stop() blocks forever.
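For reference, the relevant paths before this patch look roughly like
this (abridged from arch/x86/events/rapl.c; declarations and unrelated
code trimmed):

	static enum hrtimer_restart rapl_hrtimer_handle(struct hrtimer *hrtimer)
	{
		...
		if (!pmu->n_active)		/* checked without pmu->lock */
			return HRTIMER_NORESTART;

		raw_spin_lock_irqsave(&pmu->lock, flags);	/* (2) */
		...
	}

	static void rapl_pmu_event_stop(struct perf_event *event, int mode)
	{
		...
		raw_spin_lock_irqsave(&pmu->lock, flags);	/* (1) */

		if (!(hwc->state & PERF_HES_STOPPED)) {
			WARN_ON_ONCE(pmu->n_active <= 0);
			pmu->n_active--;
			if (pmu->n_active == 0)
				hrtimer_cancel(&pmu->hrtimer);	/* waits for the handler */
		...
	}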
This patch moves hrtimer_cancel() out of the raw_spin_lock_irqsave()
critical section, so rapl_hrtimer_handle() can acquire pmu->lock. To
avoid racing with the timer handler, the "if (!pmu->n_active)" check in
rapl_hrtimer_handle() is moved under the protection of pmu->lock, and
the lock is dropped before the early return.
Fixes: 65661f96d3b3 ("perf/x86: Add RAPL hrtimer support")
Signed-off-by: Duoming Zhou <duoming@xxxxxxxxxx>
---
 arch/x86/events/rapl.c | 13 +++++++++----
 1 file changed, 9 insertions(+), 4 deletions(-)
diff --git a/arch/x86/events/rapl.c b/arch/x86/events/rapl.c
index 77e3a47af5a..97c71538d01 100644
--- a/arch/x86/events/rapl.c
+++ b/arch/x86/events/rapl.c
@@ -219,11 +219,13 @@ static enum hrtimer_restart rapl_hrtimer_handle(struct hrtimer *hrtimer)
 	struct perf_event *event;
 	unsigned long flags;
 
-	if (!pmu->n_active)
-		return HRTIMER_NORESTART;
-
 	raw_spin_lock_irqsave(&pmu->lock, flags);
 
+	if (!pmu->n_active) {
+		raw_spin_unlock_irqrestore(&pmu->lock, flags);
+		return HRTIMER_NORESTART;
+	}
+
 	list_for_each_entry(event, &pmu->active_list, active_entry)
 		rapl_event_update(event);
@@ -281,8 +281,11 @@ static void rapl_pmu_event_stop(struct perf_event *event, int mode)
 	if (!(hwc->state & PERF_HES_STOPPED)) {
 		WARN_ON_ONCE(pmu->n_active <= 0);
 		pmu->n_active--;
-		if (pmu->n_active == 0)
+		if (!pmu->n_active) {
+			raw_spin_unlock_irqrestore(&pmu->lock, flags);
 			hrtimer_cancel(&pmu->hrtimer);
+			raw_spin_lock_irqsave(&pmu->lock, flags);
+		}
 
 		list_del(&event->active_entry);
--
2.17.1