Re: [RFC] perf_events: support for uncore a.k.a. nest units

From: Corey Ashford
Date: Fri Jan 29 2010 - 18:05:57 EST

On 1/29/2010 1:52 AM, Peter Zijlstra wrote:
> On Thu, 2010-01-28 at 14:08 -0800, Corey Ashford wrote:

> > This is one of the reasons why I am leaning toward a /sys/devices-style data
> > structure; the kernel could easily build it based on the pmus that it discovers
> > (through whatever means), and the user can fairly easily choose a pmu from this
> > structure to open, and it's unambiguous to the kernel as to which pmu the user
> > really wants.

> Well, the dumb way is simply probing all of them and seeing who responds.

That can work, but it's still fuzzy to me how a user would relate a PMU address he's encoded to an actual device in the system he's using. How would he know he's addressing the correct device (beyond the PMU type matching), given that we're likely to have hypervisors as middlemen?

> Another might be adding a pmu attribute (showing the pmu-id) to the
> existing sysfs topology layouts (system topology, pci, spu, are all
> already available in sysfs iirc).

So you'd read the id from the sysfs topology tree, and then pass that id to the interface? That's an interesting approach that eliminates the need to pass a string pmu path to the kernel.

I like this idea, but I need to read more deeply about the topology entries to understand how they work.

> > I am not convinced that this is the right place to put the event info for each PMU.

> Right, I'm not at all sure the kernel wants to know about any events
> beyond those needed for pmu scheduling constraints and possible generic
> event maps.
> 
> Clearly it needs to know about all software events, but I don't think we
> need or want exhaustive hardware event lists in the kernel.

> > > But before we go there the perf core needs to be extended to deal with
> > > multiple hardware pmus, something which isn't too hard but we need to be
> > > careful not to bloat the normal code paths for these somewhat esoteric
> > > use cases.

> > Is this something you've looked into? If so, what sort of issues have you encountered?

> I've poked at it a little, yes. While simply abstracting the current hw
> interface and making it a list of pmus isn't hard at all, it does add
> overhead to a few key locations.

> Another aspect is event scheduling; you'd want to separate the event
> lists for the various pmus so that the RR thing works as expected. This
> again adds overhead because you now need to abstract out the event lists
> as well.

> The main fast path affected by both these things is the task switch
> event scheduling, where you have to iterate all active events and their
> pmus.

> So while the abstraction itself isn't too hard, doing it so as to
> minimize the bloat on the key paths does make it interesting.


Thanks for your comments.

- Corey
