Re: [PATCH v8 1/3] perf: cavium: Support memory controller PMU counters

From: Jan Glauber
Date: Wed Jul 26 2017 - 11:13:37 EST


On Wed, Jul 26, 2017 at 04:55:22PM +0200, Borislav Petkov wrote:
> On Wed, Jul 26, 2017 at 03:35:25PM +0100, Suzuki K Poulose wrote:
> > So the Cavium EDACs, which appear as PCI devices, have a PMU attached to them.
>
> Cavium EDACs?
>
> So let me set something straight first: An EDAC driver simply talks to
> some RAS IP block and reports errors. It shouldn't have anything to do
> with a PMU.
>
> > In order to build this PMU driver as a module, we need a way to load the module
> > automatically based on the PCI id. However, since the EDAC driver already
> > registers with that PCI id, we cannot use the same for the PMU. Ideally,
>
> So this is strange. There's a single PCI ID but multiple functionalities
> behind it?

Yes, but I would still not call a memory controller a RAS IP block.
There are a number of registers on the memory controller (or on the OCX
TLK interconnect), and while some of them are RAS related there are also
other registers in the same device like the counters we want to access
via PMU code.

> > the PMU driver should be loaded when the EDAC driver is loaded. But looking
> > at the links above, it looks like you don't like the idea of triggering a
> > probe of the PMU component from the EDAC driver. We may be able to get rid
> > of the PMU specific information from the EDAC driver by maintaining the PCI
> > id of the device in the PMU driver. But we may still need to make sure that
> > the PMU driver gets a chance to probe the PMU when the device is available.
> >
> > What do you think is the best option here?
>
> Can either of the two - EDAC or PMU driver - use an alternate detection
> method?

I'm currently using pci_get_device(vendor-ID, device-ID, ...) which
works fine.
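
For readers unfamiliar with the pattern, here is a minimal user-space
sketch of the "continue from the previous match" iteration that
pci_get_device() is typically used for. It is a model, not kernel code:
struct fake_pci_dev, fake_pci_get_device() and count_devices() are
hypothetical names, the device table is passed in explicitly, and the
reference counting the real function performs on the returned and
"from" devices is left out.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct pci_dev; not the kernel structure. */
struct fake_pci_dev {
	uint16_t vendor;
	uint16_t device;
};

/*
 * Return the next entry after 'from' that matches vendor/device, or NULL
 * if there is none. Passing from == NULL starts at the beginning of the
 * table, which mirrors how pci_get_device() is called in a loop.
 */
static const struct fake_pci_dev *
fake_pci_get_device(const struct fake_pci_dev *bus, size_t n,
		    uint16_t vendor, uint16_t device,
		    const struct fake_pci_dev *from)
{
	size_t i = from ? (size_t)(from - bus) + 1 : 0;

	for (; i < n; i++)
		if (bus[i].vendor == vendor && bus[i].device == device)
			return &bus[i];
	return NULL;
}

/* Count matching devices, e.g. to set up one PMU per memory controller. */
static size_t count_devices(const struct fake_pci_dev *bus, size_t n,
			    uint16_t vendor, uint16_t device)
{
	const struct fake_pci_dev *dev = NULL;
	size_t count = 0;

	while ((dev = fake_pci_get_device(bus, n, vendor, device, dev)))
		count++;
	return count;
}
```

Because the lookup goes by vendor and device ID rather than through a
registered PCI driver, it does not conflict with another driver that is
already bound to the same device.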

> For example, we moved the main x86 EDAC drivers to x86 CPU family,
> model, stepping detection from PCI IDs because the PCI IDs were
> clumsy to use.

I'm also checking the CPU implementor (MIDR). I could check the model
too, but I still need to detect devices based on PCI IDs, as the model
check is not sufficient here (only multi-socket ThunderX has OCX TLK
devices).
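
As a rough illustration of why both checks are needed, the sketch below
decodes the implementor and part number fields of a MIDR value and
combines the implementor check with a device count obtained from PCI
enumeration. It is a user-space model under stated assumptions: the
helper names and should_probe_ocx() are hypothetical, and the count of
OCX TLK devices is assumed to come from a separate PCI ID lookup.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* MIDR_EL1 layout: implementor in bits [31:24], part number in [15:4]. */
#define MIDR_IMPLEMENTOR_SHIFT	24
#define MIDR_IMPLEMENTOR_MASK	0xffu
#define MIDR_PARTNUM_SHIFT	4
#define MIDR_PARTNUM_MASK	0xfffu

#define IMP_CAVIUM		0x43u	/* ASCII 'C' */

static unsigned int midr_implementor(uint32_t midr)
{
	return (midr >> MIDR_IMPLEMENTOR_SHIFT) & MIDR_IMPLEMENTOR_MASK;
}

static unsigned int midr_partnum(uint32_t midr)
{
	return (midr >> MIDR_PARTNUM_SHIFT) & MIDR_PARTNUM_MASK;
}

/*
 * Hypothetical policy: the CPU check alone cannot tell single-socket from
 * multi-socket systems, so the OCX TLK PMU is only set up when PCI
 * enumeration actually found at least one such device.
 */
static bool should_probe_ocx(uint32_t midr, size_t nr_ocx_devices)
{
	return midr_implementor(midr) == IMP_CAVIUM && nr_ocx_devices > 0;
}
```

With a Cavium MIDR and zero OCX TLK devices found, should_probe_ocx()
returns false, which is the single-socket case described above.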

--Jan

> --
> Regards/Gruss,
> Boris.
>
> ECO tip #101: Trim your mails when you reply.
> --