Re: [RFC] [PATCH 1/1] perf: add support for arch-dependent symbolic event names to "perf stat"

From: Corey Ashford
Date: Tue Mar 16 2010 - 14:25:00 EST


On 3/16/2010 2:40 AM, Ingo Molnar wrote:

* Corey Ashford<cjashfor@xxxxxxxxxxxxxxxxxx> wrote:

On 3/11/2010 12:46 PM, Corey Ashford wrote:


On 3/11/2010 11:14 AM, Ingo Molnar wrote:

* Corey Ashford<cjashfor@xxxxxxxxxxxxxxxxxx> wrote:
[snip]
I'm not sure how that would work. The issue I am trying to solve
here is that Power arch chips have a large number of very
hardware-specific events that are not generalizable. Many of these
events not only have names, but other user-configurable bits as well
that select or narrow the scope of which exact events are recorded.
This issue is dealt with nicely in libpfm4, as it has mechanisms for
parsing event names and attributes (aka modifiers or unit masks),
and then produces a usable config field for the perf_event_attr
struct.

Should I take it from the above that you are completely against the
idea of using an external library for hardware-specific event and
attribute naming?
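
For illustration, the encoding step mentioned above (named attributes folded into a single config value) can be sketched in a few lines of Python. This is only a sketch: the field names, bit positions and widths in FIELDS are invented for the example and are not taken from libpfm4 or from any real Power PMU layout, and encode_config is a hypothetical helper, not a libpfm4 call.

```python
# Hypothetical field layout: name -> (shift, width in bits).
# These positions are made up for illustration; real PMU encodings differ.
FIELDS = {
    "event": (0, 8),
    "lane": (8, 2),
    "vlane": (10, 2),
}


def encode_config(values, fields=FIELDS):
    """Pack named attribute values into one integer config word."""
    config = 0
    for name, value in values.items():
        if name not in fields:
            raise KeyError(f"unknown attribute: {name}")
        shift, width = fields[name]
        # Reject values that would spill into a neighbouring field.
        if value < 0 or value >= (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
        config |= value << shift
    return config


config = encode_config({"event": 0x03, "lane": 2, "vlane": 1})
print(hex(config))
```

The point of keeping the layout in a table is that the per-chip knowledge lives in data rather than in the tool itself, which is roughly the role an external event library plays.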

Could you give a few relevant examples of events in question, and the
kind of configurability/attributes they have on Power?

Here are a few examples for the Power A2 processor. I've distorted the
names because the PMU architecture isn't publicly released yet.

PM_DE_PMC_9:hrd_mask=0xff:hrd=0x22:pma_mask=0x3fff:pma=0x1b2d:culling_mode=3

PM_EX_0x03:lane=2:vlane=1
PM_OWE_ENG_MAC_FULL:usu=3

Just a follow-up note to this...

I learned that much of the high-level architecture of the new
chip that IBM is working on has been publicly released recently, so
I have "undistorted" the event names below:

PM_DC_PMC_9:lpid_mask=0xff:lpid=0x22:pid_mask=0x3fff:pid=0x1b2d:marking_mode=3
PM_REGX_0x03:lane=2:vlane=1
PM_XML_ENG_MAC_FULL:sus=3


DC = Decompression/Compression accelerator
PMC_9 = Performance monitoring event 9
REGX = Regular eXpression accelerator
XML = XML parsing accelerator
pid = process id to match
pid_mask = process id match mask
lpid = logical partition id
lpid_mask = logical partition id mask
sus = source unit select
lane, vlane = signal routing fields
marking_mode = used to determine which accelerator work units to
mark for performance monitoring
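
To make the shape of these event strings concrete, here is a minimal Python sketch that splits one into an event name and a set of integer attributes. It assumes only the syntax visible in the examples above (fields separated by ":", attributes written as name=value, values in decimal or 0x-prefixed hex); parse_event is a hypothetical helper written for this note and is not how libpfm4 or perf parses events.

```python
def parse_event(spec):
    """Split 'NAME:attr=value:...' into (name, {attr: int_value})."""
    name, *parts = spec.split(":")
    if not name:
        raise ValueError("missing event name")
    attrs = {}
    for part in parts:
        key, sep, value = part.partition("=")
        if not sep or not key:
            raise ValueError(f"malformed attribute: {part!r}")
        # Base 0 lets int() accept both "0x1b2d" and "3".
        attrs[key] = int(value, 0)
    return name, attrs


name, attrs = parse_event("PM_REGX_0x03:lane=2:vlane=1")
print(name, attrs)
```

Note that the "0x03" in PM_REGX_0x03 is part of the event name, not an attribute, so it is left untouched by the split.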

Are these special-purpose instructions for compression/regex/xml-parsing
speedups?

No, these events are for nest (aka uncore) accelerators for compression/regex/xml-parsing. These accelerators operate independently of the CPU threads and are given work units via request blocks which are then queued up by the accelerator.


I think it would be rather useful to merge the hw (and sw) perf events with
the ftrace/tracepoints symbolic events space. That would be a one-stop-shop
for both perf and other tools to figure out the events we offer, their
characteristics, format, relationship to other events, etc.

Ingo

Ok, I will look into this. Thank you for your advice.

- Corey
