Re: [RFC] perf_events: support for uncore a.k.a. nest units

From: Corey Ashford
Date: Thu Jan 28 2010 - 14:44:47 EST

On 1/28/2010 11:06 AM, Peter Zijlstra wrote:
On Thu, 2010-01-28 at 10:00 -0800, Corey Ashford wrote:

I don't quite get what you're saying here. Perhaps you are thinking
that all uncore units are associated with a particular cpu node, or a
set of cpu nodes? And that there's only one uncore unit per cpu (or set
of cpus) that needs to be addressed, i.e. no ambiguity?

Well, I was initially thinking of the Intel uncore thing, which is a
memory controller, so node level.

But all system topology bound pmus can be done that way.

That is not going to be the case for all systems. We can have uncore
units that are associated with the entire system,

Right, but that's simple too.

for example PMUs in an I/O device.

And we can have multiple uncore units of a particular
type, for example multiple vector coprocessors, each with its own PMU
and each associated with a single cpu or a set of cpus.

perf_events needs an addressing scheme that covers these cases.

You could possibly add a u64 pmu_id field to perf_event_attr and use
that together with things like:

PERF_TYPE_PCI, attr.pmu_id = domain:bus:device:function encoding
PERF_TYPE_SPU, attr.pmu_id = spu-id
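
To make the PCI case concrete, here is a minimal sketch of how a
domain:bus:device:function address might be packed into a 64-bit
pmu_id. Everything here is hypothetical: neither PERF_TYPE_PCI nor a
pmu_id field exists in perf_event_attr, and the bit layout and the
pci_pmu_id_* helper names are assumptions. Only the field widths (16-bit
domain, 8-bit bus, 5-bit device, 3-bit function) follow standard PCI
addressing.

```c
#include <stdint.h>

/*
 * Hypothetical layout for the proposed attr.pmu_id (not an existing ABI):
 *   bits 31..16  PCI domain (segment)
 *   bits 15..8   bus
 *   bits  7..3   device
 *   bits  2..0   function
 * The upper 32 bits are left zero.
 */
static inline uint64_t pci_pmu_id_encode(uint16_t domain, uint8_t bus,
                                         uint8_t dev, uint8_t fn)
{
    return ((uint64_t)domain << 16) |
           ((uint64_t)bus << 8) |
           ((uint64_t)(dev & 0x1f) << 3) |
           (uint64_t)(fn & 0x7);
}

/* Accessors that undo the packing above. */
static inline uint16_t pci_pmu_id_domain(uint64_t id)
{
    return (uint16_t)((id >> 16) & 0xffff);
}

static inline uint8_t pci_pmu_id_bus(uint64_t id)
{
    return (uint8_t)((id >> 8) & 0xff);
}

static inline uint8_t pci_pmu_id_dev(uint64_t id)
{
    return (uint8_t)((id >> 3) & 0x1f);
}

static inline uint8_t pci_pmu_id_fn(uint64_t id)
{
    return (uint8_t)(id & 0x7);
}
```

A caller would then set something like attr.pmu_id =
pci_pmu_id_encode(0, 2, 3, 1) for device 0000:02:03.1, assuming the
field existed. The SPU case needs no packing since the id is a plain
integer.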

Thank you for that clarification.

One of Ingo's comments was that he wants perf to be able to expose all of the available PMUs via the perf tool, and that perf should be able to parse some data structure (somewhere) that would contain all of the info the user would need to choose a particular PMU. Do you have some ideas about how that could be accomplished using the above encoding scheme? I can see how it would be fairly easy to come up with a PERF_TYPE_* encoding per topology, and then interpret all of those bits correctly within the kernel (which is savvy to that topology), but I don't see a straightforward way to expose that structure to perf. How would perf know which of those encodings apply to the current system, how many PMUs there are of each type, etc.?

That's why I'm leaning toward a /sys/devices-style pseudo fs at the moment. If there's a simpler, better way, I'm open to it.
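
As a rough illustration of what discovery through such a pseudo fs could look like from user space, the sketch below walks a directory in which each entry stands for one PMU. The directory layout, the pmu_enumerate name, and the callback type are all assumptions made for illustration; nothing in this thread defines where such a tree would live or what it would contain.

```c
#include <dirent.h>
#include <stddef.h>

/* Callback invoked once per PMU entry found under the root directory. */
typedef void (*pmu_visit_fn)(const char *name, void *ctx);

/*
 * Walk a hypothetical sysfs-style directory in which every non-hidden
 * entry represents one PMU. Calls visit() for each entry when visit is
 * non-NULL. Returns the number of entries seen, or -1 if the directory
 * cannot be opened.
 */
static int pmu_enumerate(const char *root, pmu_visit_fn visit, void *ctx)
{
    DIR *dir = opendir(root);
    struct dirent *ent;
    int count = 0;

    if (!dir)
        return -1;

    while ((ent = readdir(dir)) != NULL) {
        /* Skip ".", ".." and any other hidden entries. */
        if (ent->d_name[0] == '.')
            continue;
        if (visit)
            visit(ent->d_name, ctx);
        count++;
    }

    closedir(dir);
    return count;
}
```

With a layout like this, the tool would not need a per-topology table compiled in: it would learn which PMUs exist, and how many of each type, by reading the tree at run time.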

But before we go there the perf core needs to be extended to deal with
multiple hardware pmus, something which isn't too hard but we need to be
careful not to bloat the normal code paths for these somewhat esoteric
use cases.

Is this something you are looking into?

- Corey
