Re: [PATCH 1/2] perf/x86/intel: Fix a crash caused by zero PEBS status

From: Namhyung Kim
Date: Tue Mar 16 2021 - 23:01:42 EST


On Fri, Mar 12, 2021 at 05:21:37AM -0800, kan.liang@xxxxxxxxxxxxxxx wrote:
> From: Kan Liang <kan.liang@xxxxxxxxxxxxxxx>
>
> A repeatable crash can be triggered by the perf_fuzzer on some Haswell
> system.
> https://lore.kernel.org/lkml/7170d3b-c17f-1ded-52aa-cc6d9ae999f4@xxxxxxxxx/
>
> For some old CPUs (HSW and earlier), the PEBS status in a PEBS record
> may be mistakenly set to 0. To minimize the impact of the defect, the
> commit was introduced to try to avoid dropping the PEBS record for some
> cases. It adds a check in the intel_pmu_drain_pebs_nhm(), and updates
> the local pebs_status accordingly. However, it doesn't correct the PEBS
> status in the PEBS record, which may trigger the crash, especially for
> the large PEBS.
>
> It's possible that all the PEBS records in a large PEBS have the PEBS
> status 0. If so, the first get_next_pebs_record_by_bit() in the
> __intel_pmu_pebs_event() returns NULL. The at = NULL. Since it's a large
> PEBS, the 'count' parameter must > 1. The second
> get_next_pebs_record_by_bit() will crash.
>
> Besides the local pebs_status, correct the PEBS status in the PEBS
> record as well.
>
> Fixes: 01330d7288e0 ("perf/x86: Allow zero PEBS status with only single active event")
> Reported-by: Vince Weaver <vincent.weaver@xxxxxxxxx>
> Suggested-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
> Signed-off-by: Kan Liang <kan.liang@xxxxxxxxxxxxxxx>
> Cc: stable@xxxxxxxxxxxxxxx

Tested-by: Namhyung Kim <namhyung@xxxxxxxxxx>

Thanks,
Namhyung


> ---
> arch/x86/events/intel/ds.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
> index 7ebae18..bcf4fa5 100644
> --- a/arch/x86/events/intel/ds.c
> +++ b/arch/x86/events/intel/ds.c
> @@ -2010,7 +2010,7 @@ static void intel_pmu_drain_pebs_nhm(struct pt_regs *iregs, struct perf_sample_d
> */
> if (!pebs_status && cpuc->pebs_enabled &&
> !(cpuc->pebs_enabled & (cpuc->pebs_enabled-1)))
> - pebs_status = cpuc->pebs_enabled;
> + pebs_status = p->status = cpuc->pebs_enabled;
>
> bit = find_first_bit((unsigned long *)&pebs_status,
> x86_pmu.max_pebs_events);
> --
> 2.7.4
>