On Thu, 2009-05-28 at 16:58 +0200, stephane eranian wrote:
- uint64_t irq_period
IRQ is an x86 related name. Why not use smpl_period instead?
I don't really care, but IRQ seems to be used throughout Linux; we could
name the thing interrupt or sample period.
- uint32_t record_type
This field is a bitmask. I believe 32-bit is too small to accommodate
future record formats.
It currently controls 8 aspects of the overflow entry, do you really
foresee the need for more than 32?
I would assume that on the read() side, counts are accumulated as
64-bit integers. But if it is the case, then it seems there is an
asymmetry between period and counts.
Given that your API is high level, I don't think tools should have to
worry about the actual width of a counter. This is especially true
because they don't know which counters the event is going to go into
and if I recall correctly, on some PMU models, different counters can
have different widths (Power, I think).
It is rather convenient for tools to always manipulate counters as
64-bit integers. You should provide a consistent view between counts
and periods.
So you're suggesting to artificially stretch periods by, say, composing a
single overflow from smaller ones, ignoring the intermediate overflow
events?
That sounds doable, again, patch welcome.
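
To make the idea concrete, here is a minimal sketch of composing one 64-bit
period out of several narrower hardware overflows. It is not taken from any
actual patch: the struct, the function names and the hw_bits parameter are
all hypothetical, and it assumes the hardware can be programmed to overflow
after an arbitrary count that fits in the counter width.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-event state; the names are illustrative only. */
struct soft_period {
	uint64_t period;    /* full 64-bit period requested by the tool */
	uint64_t remaining; /* events left before the next real sample */
	uint64_t hw_max;    /* largest count the hardware counter can hold */
};

/* Size of the next chunk to program into the hardware counter. */
static uint64_t next_chunk(const struct soft_period *sp)
{
	return sp->remaining < sp->hw_max ? sp->remaining : sp->hw_max;
}

/* Set up the state and return the first chunk to program. */
static uint64_t soft_period_init(struct soft_period *sp, uint64_t period,
				 unsigned int hw_bits)
{
	assert(period > 0 && hw_bits > 0 && hw_bits <= 64);
	sp->period = period;
	sp->remaining = period;
	sp->hw_max = hw_bits == 64 ? UINT64_MAX
				   : (UINT64_C(1) << hw_bits) - 1;
	return next_chunk(sp);
}

/*
 * Called on every hardware overflow, after `chunk` events have elapsed.
 * Returns true only when the full 64-bit period has expired, so that
 * intermediate overflows are silently absorbed. *next receives the chunk
 * to program for the following hardware interval.
 */
static bool soft_period_overflow(struct soft_period *sp, uint64_t chunk,
				 uint64_t *next)
{
	bool sample;

	sp->remaining -= chunk;
	sample = (sp->remaining == 0);
	if (sample)
		sp->remaining = sp->period; /* start the next full period */
	*next = next_chunk(sp);
	return sample;
}
```

With a 2-bit counter and a period of 10, this would program chunks of 3, 3,
3 and 1, and only the fourth overflow would produce a sample.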
4/ Grouping
By design, an event can only be part of one group at a time. Events in
a group are guaranteed to be active on the PMU at the same time. That
means a group cannot have more events than there are available counters
on the PMU. Tools may want to know the number of counters available in
order to group their events accordingly, such that reliable ratios
could be computed. It seems the only way to know this is by trial and
error. This is not practical.
Got a proposal to amend this?
5/ Multiplexing and scaling
The PMU can be shared by multiple programs each controlling a variable
number of events. Multiplexing occurs by default unless pinned is
requested. The exclusive option only guarantees the group does not
share the PMU with other groups while it is active; at least, this is
my understanding.
We have pinned and exclusive: pinned means always on the PMU; exclusive
means that while the group is on the PMU, no one else can be.
III/ Requests
2/ Sampling period randomization
It is our experience (on Itanium, for instance), that for certain
sampling measurements, it is beneficial to randomize the sampling
period a bit. This is in particular the case when sampling on an
event that happens very frequently and which is not related to
timing, e.g., branch_instructions_retired. Randomization helps mitigate
the bias. You do not need anything sophisticated. But when you are using
a kernel-level sampling buffer, you need to have the kernel randomize.
Randomization needs to be supported per event.
Corey raised this a while back, I asked what kind of parameters were
needed and if a specific (p)RNG was specified.
Is something with an (avg,std) good enough? Do you have an
implementation that I can borrow, or even better a patch? :-)
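
For reference while the parameters are still open, below is a minimal sketch
of the "nothing sophisticated" variant: a nominal period plus uniform jitter
taken from a power-of-two mask, with the state kept per event. This is only
an assumption about what would suffice, not something specified in this
thread. The struct, its field names and the choice of a xorshift64 generator
are all illustrative, and an (avg,std) interface would need a different
distribution.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-event randomization state; names are illustrative. */
struct rand_period {
	uint64_t base; /* nominal sampling period */
	uint64_t mask; /* jitter mask, must be of the form 2^k - 1 */
	uint64_t seed; /* PRNG state, must be nonzero */
};

/* One step of a xorshift64 generator; cheap and not cryptographic. */
static uint64_t xorshift64(uint64_t *state)
{
	uint64_t x = *state;

	x ^= x << 13;
	x ^= x >> 7;
	x ^= x << 17;
	*state = x;
	return x;
}

/*
 * Return the period to program for the next sample. The result lies in
 * [base - mask/2, base - mask/2 + mask], i.e. roughly centered on base.
 * The caller is assumed to keep base + mask below UINT64_MAX.
 */
static uint64_t next_period(struct rand_period *rp)
{
	uint64_t jitter;

	assert(rp->seed != 0);
	assert((rp->mask & (rp->mask + 1)) == 0); /* mask is 2^k - 1 */
	assert(rp->base > rp->mask / 2);          /* period stays positive */

	jitter = xorshift64(&rp->seed) & rp->mask;
	return rp->base - rp->mask / 2 + jitter;
}
```

With base = 100000 and mask = 0xFF, every period would fall between 99873
and 100128, which should be enough to break lockstep with a tight loop.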