Re: [PATCH v1] drivers/perf: apple_m1: fix affinity table for event 0x96 and 0x9b

From: Will Deacon
Date: Mon Jul 08 2024 - 08:00:40 EST


On Tue, Jul 02, 2024 at 08:43:16PM +0800, Yangyu Chen wrote:
>
>
> > On Jul 2, 2024, at 20:13, Will Deacon <will@xxxxxxxxxx> wrote:
> >
> > On Tue, Jul 02, 2024 at 11:58:00AM +0100, Marc Zyngier wrote:
> >> On Tue, 02 Jul 2024 11:22:21 +0100,
> >> Yangyu Chen <cyy@xxxxxxxxxxxx> wrote:
> >>>
> >>>> Yangyu, can you please clarify how you came to the conclusion that
> >>>> these events didn't count anywhere other than counter 7?
> >>>>
> >>>
> >>> IIRC, I came across some web page that says events 0x96 and 0x9b
> >>> can only be installed on counter 7 to count Apple AMX, but I can't
> >>> find the page now. Since AMX is not usable in Linux, I don't know
> >>> if this will affect some other instructions that are usable in
> >>> Linux.
> >>
> >> As you said, AMX cannot be used with Linux, and that's unlikely to
> >> ever change. But when it comes to the standard ARM ISA, we can only
> >> observe counters 5, 6 and 7 being incremented at the exact same
> >> rate.
> >>
> >> So reading between the lines, what I understand is that AMX
> >> instructions would only have their effects counted in counter 7 for
> >> these events, while other instructions would be counted in all 3
> >> counters.
> >>
> >> By extension, such behaviour could be applied to SME on HW that
> >> supports it (wild guess).
> >>
> >>> There are some other reasons, but I can't say in public.
> >>
> >> Fair enough, I'm not asking for the disclosure of anything that isn't
> >> public (the less I know, the better).
> >>
> >>> Even though I can't find the actual usage, I think using counter 7
> >>> only for these 2 events is safer. If this reason is insufficient,
> >>> we can ignore this patch until we find other evidence that this
> >>> affinity affects some instructions usable in Linux.
> >>
> >> I honestly don't mind.
> >>
> >> The whole thing is a black box, and is more useful as an interrupt
> >> generator than an actual PMU, due to the lack of freely available
> >> documentation. If the PMU maintainers want to merge this, I won't
> >> oppose it.
> >
> > I'd rather leave the code as-is than tweak specific counters based on
> > a combination of guesswork and partial information.
> >
> > Of course, if somebody who knows better wants to fix up all of the
> > mappings (because this surely isn't the only corner-case), then we can
> > take that. But at least what we have today has _some_ sort of consistent
> > rationale behind it.
>
> Actually, anyone who has macOS software can learn the whole affinity
> table of the PMU. The detailed information can be extracted from a plist
> file stored in the macOS root filesystem. I also provide a script
> [1] to extract this information.
>
> However, I can't directly use this information because of legal concerns.
> Would this be acceptable if the information I provide matches Apple's
> information? I can't say whether it matches or not in public. I can
> only say we can easily find someone who uploaded this file to the
> internet.
>
> [1] https://github.com/cyyself/m1-pmu-gen

I can't say I feel hugely comfortable with this, so I'll leave the code
as-is unless a patch shows up fixing all the events.

Thanks for the reply, though. You've clearly spent a bunch of effort on
this and it's a pity we can't easily apply your results to the driver :/

Will
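
For readers following the thread, here is a minimal sketch of what a
per-event counter affinity amounts to: a bitmask of the counters an event
may be scheduled on. It contrasts the "counters 5, 6 and 7" behaviour
discussed above with the proposed "counter 7 only" restriction. This is
not the driver's code. The names CTR_MASK, ONLY_5_TO_7, ONLY_7,
counter_allowed and pick_counter are hypothetical, and the total of 10
counters is an assumption made for the illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption for this sketch: the PMU exposes 10 counters, numbered 0..9. */
#define NR_COUNTERS 10

/* Hypothetical helper: bitmask with bits lo..hi (inclusive) set. */
#define CTR_MASK(hi, lo) \
	((uint16_t)(((1u << ((hi) - (lo) + 1)) - 1u) << (lo)))

#define ONLY_5_TO_7 CTR_MASK(7, 5) /* event may use counters 5, 6 or 7 */
#define ONLY_7      CTR_MASK(7, 7) /* event may use counter 7 only */

/* Return true if `counter` is permitted by the affinity mask. */
static bool counter_allowed(uint16_t affinity, unsigned int counter)
{
	return (affinity >> counter) & 1u;
}

/*
 * Pick the lowest-numbered counter that the affinity mask permits and
 * that is not already marked busy in `in_use`. Returns -1 if none fits.
 */
static int pick_counter(uint16_t affinity, uint16_t in_use)
{
	uint16_t avail = affinity & (uint16_t)~in_use;

	for (int i = 0; i < NR_COUNTERS; i++)
		if (avail & (1u << i))
			return i;
	return -1;
}
```

With the wider mask an event can still be scheduled when counter 5 is
busy; with the counter-7-only mask, a second such event has nowhere to go
once counter 7 is taken. That trade-off is what narrowing the affinity
would change.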