Re: [PATCH] perf evlist: fix memory corruption for Kernel PMU event

From: Namhyung Kim
Date: Mon Oct 05 2020 - 21:26:18 EST


Hello,

On Fri, Oct 2, 2020 at 12:02 PM Song Bao Hua (Barry Song)
<song.bao.hua@xxxxxxxxxxxxx> wrote:
>
>
>
> > -----Original Message-----
> > From: Andi Kleen [mailto:ak@xxxxxxxxxxxxxxx]
> > Sent: Friday, October 2, 2020 12:07 PM
> > To: Song Bao Hua (Barry Song) <song.bao.hua@xxxxxxxxxxxxx>
> > Cc: linux-kernel@xxxxxxxxxxxxxxx; Linuxarm <linuxarm@xxxxxxxxxx>; Peter
> > Zijlstra <peterz@xxxxxxxxxxxxx>; Ingo Molnar <mingo@xxxxxxxxxx>; Arnaldo
> > Carvalho de Melo <acme@xxxxxxxxxx>; Mark Rutland
> > <mark.rutland@xxxxxxx>; Alexander Shishkin
> > <alexander.shishkin@xxxxxxxxxxxxxxx>; Jiri Olsa <jolsa@xxxxxxxxxx>;
> > Namhyung Kim <namhyung@xxxxxxxxxx>; Adrian Hunter
> > <adrian.hunter@xxxxxxxxx>; Alexey Budankov
> > <alexey.budankov@xxxxxxxxxxxxxxx>
> > Subject: Re: [PATCH] perf evlist: fix memory corruption for Kernel PMU event
> >
> > On Fri, Oct 02, 2020 at 12:57:29AM +1300, Barry Song wrote:
> > > Commit 7736627b865d ("perf stat: Use affinity for closing file
> > > descriptors") uses FD(evsel, cpu, thread) to read and write the file
> > > descriptor xyarray. For a kernel PMU event, this leads to serious
> > > memory corruption and a perf crash.
> > > I have seen evlist->core.cpus->nr be 1 while the evsel's cpus->nr is
> > > the total number of CPUs, so the xyarray, which is allocated according
> > > to evlist->core.cpus->nr, overflows. This leads to various
> > > segmentation faults in the perf tool for kernel PMU events, e.g.:
> > > ./perf stat -e bus_cycles sleep 1
> > > *** Error in `./perf': free(): invalid next size (fast):
> > > 0x00000000401e6370 *** Aborted (core dumped)
> >
> > Thanks.
> >
> > I believe there is already a patch queued for this.
>
> Andi, thanks! Could you share the link or the commit ID? I'd like to take a
> look at the fix. I can still reproduce this issue on Linus' latest tree, and
> I didn't find any commit related to it in linux-next or tip/perf/core.

I think Andi was referring to this discussion, which has not been merged yet:

https://lore.kernel.org/lkml/20200922031346.15051-2-liwei391@xxxxxxxxxx/

I suggested a patch at the end of that thread. Can you please try it?

Thanks
Namhyung
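
For readers following the quoted description, here is a minimal sketch of the
size mismatch it reports: a table allocated for one CPU but walked with a
larger CPU count. This is not perf's xyarray and not the proposed fix;
struct fd_grid and its helpers are hypothetical names made up for
illustration, and the bounds check only shows where the out-of-range
accesses would occur.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical per-(cpu, thread) fd table; NOT perf's real xyarray. */
struct fd_grid {
	size_t nr_cpus;
	size_t nr_threads;
	int fds[];		/* nr_cpus * nr_threads entries */
};

/* Allocate a table for nr_cpus x nr_threads entries, all set to -1. */
struct fd_grid *fd_grid_new(size_t nr_cpus, size_t nr_threads)
{
	struct fd_grid *g;
	size_t i, n = nr_cpus * nr_threads;

	if (n == 0)
		return NULL;

	g = malloc(sizeof(*g) + n * sizeof(int));
	if (!g)
		return NULL;

	g->nr_cpus = nr_cpus;
	g->nr_threads = nr_threads;
	for (i = 0; i < n; i++)
		g->fds[i] = -1;
	return g;
}

/* Checked accessor: returns NULL instead of touching memory out of range. */
int *fd_grid_entry(struct fd_grid *g, size_t cpu, size_t thread)
{
	if (!g || cpu >= g->nr_cpus || thread >= g->nr_threads)
		return NULL;
	return &g->fds[cpu * g->nr_threads + thread];
}

/*
 * Walk CPU indices 0..nr_iter_cpus-1 and count how many fall outside the
 * table. With an unchecked accessor, each of these would be an
 * out-of-bounds read or write on the heap.
 */
size_t fd_grid_count_out_of_range(struct fd_grid *g, size_t nr_iter_cpus)
{
	size_t cpu, bad = 0;

	for (cpu = 0; cpu < nr_iter_cpus; cpu++)
		if (!fd_grid_entry(g, cpu, 0))
			bad++;
	return bad;
}
```

With a table built by fd_grid_new(1, 1) and a loop over 8 CPU indices, seven
of the eight lookups are out of range, which is the kind of mismatch the
report describes between evlist->core.cpus->nr and the evsel's cpus->nr.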