Re: [PATCH v3] perf/core: Fix reentry problem in perf_output_read_group

From: Yang Jihong
Date: Tue Aug 16 2022 - 23:18:43 EST


Hello,

On 2022/8/16 22:13, Peter Zijlstra wrote:
On Tue, Aug 16, 2022 at 05:11:03PM +0800, Yang Jihong wrote:
perf_output_read_group may respond to IPI request of other cores and invoke
__perf_install_in_context function. As a result, hwc configuration is modified.
As a result, the hwc configuration is modified, causing inconsistency and
unexpected consequences.

read_pmevcntrn+0x1e4/0x1ec arch/arm64/kernel/perf_event.c:423
armv8pmu_read_evcntr arch/arm64/kernel/perf_event.c:467 [inline]
armv8pmu_read_hw_counter arch/arm64/kernel/perf_event.c:475 [inline]
armv8pmu_read_counter+0x10c/0x1f0 arch/arm64/kernel/perf_event.c:528
armpmu_event_update+0x9c/0x1bc drivers/perf/arm_pmu.c:247
armpmu_read+0x24/0x30 drivers/perf/arm_pmu.c:264
perf_output_read_group+0x4cc/0x71c kernel/events/core.c:6806
perf_output_read+0x78/0x1c4 kernel/events/core.c:6845
perf_output_sample+0xafc/0x1000 kernel/events/core.c:6892
__perf_event_output kernel/events/core.c:7273 [inline]
perf_event_output_forward+0xd8/0x130 kernel/events/core.c:7287
__perf_event_overflow+0xbc/0x20c kernel/events/core.c:8943
perf_swevent_overflow kernel/events/core.c:9019 [inline]
perf_swevent_event+0x274/0x2c0 kernel/events/core.c:9047
do_perf_sw_event kernel/events/core.c:9160 [inline]
___perf_sw_event+0x150/0x1b4 kernel/events/core.c:9191
__perf_sw_event+0x58/0x7c kernel/events/core.c:9203
perf_sw_event include/linux/perf_event.h:1177 [inline]

Interrupts is not disabled when perf_output_read_group reads PMU counter.

s/is/are/ due to 'interrupts' being plural
Ok, will fix in next version.

Anyway, yes, I suppose this is indeed so. That code expects to run with
IRQs disabled but in the case of software events that isn't so.

Do we need to determine whether it is a software event?
It feels like it's simply disable IRQs interrupts, with little impact.
In this case, IPI request may be received from other cores.
As a result, PMU configuration is modified and an error occurs when
reading PMU counter:

<SNIP>


Thanks,
Yang