RE: [PATCH] perf/x86/intel/uncore: Fix oops when counting IMC uncore events on some TGL

From: David Laight
Date: Wed May 27 2020 - 10:51:13 EST


From: Liang, Kan
> Sent: 27 May 2020 15:47
> On 5/27/2020 8:59 AM, David Laight wrote:
> > From: kan.liang@xxxxxxxxxxxxxxx
> >> Sent: 27 May 2020 13:31
> >>
> >> From: Kan Liang <kan.liang@xxxxxxxxxxxxxxx>
> >>
> >> When counting IMC uncore events on some TGL machines, an oops will be
> >> triggered.
> >> [ 393.101262] BUG: unable to handle page fault for address: ffffb45200e15858
> >> [ 393.101269] #PF: supervisor read access in kernel mode
> >> [ 393.101271] #PF: error_code(0x0000) - not-present page
> >>
> >> The current perf uncore driver still uses the IMC MAP SIZE inherited
> >> from SNB, which is 0x6000.
> >> However, the offset of IMC uncore counters for some TGL machines is
> >> larger than 0x6000, e.g. 0xd8a0.
> >>
> >> Enlarge the IMC MAP SIZE for TGL to 0xe000.
> >
> > Replacing one 'random' constant with a different one
> > doesn't seem like a proper fix.
> >
> > Surely the actual bounds of the 'memory' area are properly
> > defined somewhere.
> > Or at least should come from a table.
> >
> > You also need to verify that the offsets are within the mapped area.
> > An unexpected offset shouldn't try to access an invalid address.
>
> Thanks for the review.
>
> I agree that we should add a check before mapping the area to prevent
> the issue from happening again.
>
> I think the check should be a generic check for all platforms which try
> to map an area, not just for TGL. I will submit a separate patch for the
> check.

You need a check that the actual access is within the mapped area.
So instead of getting an oops you get an error.

That check has to be done on each access, after you've mapped the area,
not just once before mapping it.
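
Something along these lines is what I mean. This is only a userspace
sketch, not the uncore driver code: 'struct mapped_region' and
region_read64() are made-up names for illustration, and a real driver
would use readq() or similar on the ioremap'ed address rather than
memcpy().

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical descriptor for a mapped counter area (illustrative only). */
struct mapped_region {
	const uint8_t *base;	/* start of the mapping */
	size_t size;		/* number of bytes actually mapped */
};

/*
 * Read a 64-bit counter at 'offset'.
 * Returns 0 on success, or -EINVAL if any byte of the access would
 * fall outside the mapped area.
 */
static int region_read64(const struct mapped_region *r, size_t offset,
			 uint64_t *val)
{
	if (!r || !r->base || !val)
		return -EINVAL;

	/* Compare against size - 8 so that 'offset + 8' cannot overflow. */
	if (r->size < sizeof(*val) || offset > r->size - sizeof(*val))
		return -EINVAL;

	memcpy(val, r->base + offset, sizeof(*val));
	return 0;
}
```

With a 0x6000 byte mapping, a read at offset 0xd8a0 then fails with
-EINVAL instead of faulting on an unmapped page.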

David
