[PATCH AUTOSEL 4.4 6/6] powerpc/perf: Fix soft lockups due to missed interrupt accounting

From: Sasha Levin
Date: Mon Aug 24 2020 - 12:41:47 EST


From: Athira Rajeev <atrajeev@xxxxxxxxxxxxxxxxxx>

[ Upstream commit 17899eaf88d689529b866371344c8f269ba79b5f ]

Performance monitor interrupt handler checks if any counter has
overflown and calls record_and_restart() in core-book3s which invokes
perf_event_overflow() to record the sample information. Apart from
creating sample, perf_event_overflow() also does the interrupt and
period checks via perf_event_account_interrupt().

Currently we record information only if the SIAR (Sampled Instruction
Address Register) valid bit is set (using siar_valid() check) and
hence the interrupt check.

But it is possible that we do sampling for some events that are not
generating valid SIAR, and hence there is no chance to disable the
event if interrupts are more than max_samples_per_tick. This leads to
soft lockup.

Fix this by adding perf_event_account_interrupt() in the invalid SIAR
code path for a sampling event. ie if SIAR is invalid, just do
interrupt check and don't record the sample information.

Reported-by: Alexey Kardashevskiy <aik@xxxxxxxxx>
Signed-off-by: Athira Rajeev <atrajeev@xxxxxxxxxxxxxxxxxx>
Tested-by: Alexey Kardashevskiy <aik@xxxxxxxxx>
Signed-off-by: Michael Ellerman <mpe@xxxxxxxxxxxxxx>
Link: https://lore.kernel.org/r/1596717992-7321-1-git-send-email-atrajeev@xxxxxxxxxxxxxxxxxx
Signed-off-by: Sasha Levin <sashal@xxxxxxxxxx>
---
arch/powerpc/perf/core-book3s.c | 4 ++++
1 file changed, 4 insertions(+)

diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
index 30e2e8efbe6b7..aab13558e9700 100644
--- a/arch/powerpc/perf/core-book3s.c
+++ b/arch/powerpc/perf/core-book3s.c
@@ -2040,6 +2040,10 @@ static void record_and_restart(struct perf_event *event, unsigned long val,

if (perf_event_overflow(event, &data, regs))
power_pmu_stop(event, 0);
+ } else if (period) {
+ /* Account for interrupt in case of invalid SIAR */
+ if (perf_event_account_interrupt(event))
+ power_pmu_stop(event, 0);
}
}

--
2.25.1