Re: [RFC][PATCH 00/11] perf pmu interface -v2
From: Corey Ashford
Date: Wed Jun 30 2010 - 13:19:41 EST
On 06/28/2010 08:13 AM, Peter Zijlstra wrote:
On Sat, 2010-06-26 at 09:22 -0700, Corey Ashford wrote:
As for the "hardware write batching", can you describe a bit more about
what you have in mind there? I wonder if this might have something to
do with accounting for PMU hardware which is slow to access, for
example, via I2C via an internal bridge.
Right, so the write batching is basically delaying writing out the PMU
state to hardware until pmu::pmu_enable() time. It avoids having to
re-program the hardware when, due to a scheduling constraint, we have to
move counters around.
So say we context switch a task, removing the old events and adding the
new ones under a single pmu::pmu_disable()/::pmu_enable() pair: we will
only hit the hardware twice (once to disable, once to enable), instead
of once for each individual ::del()/::add().
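
The batching described above can be sketched in plain userspace C. This
is only a toy model, not code from the patch set: struct toy_pmu, the
toy_pmu_*() helpers, and the hw_accesses counter are all hypothetical
names invented for illustration. It also assumes the whole write-out at
enable time counts as one hardware hit, mirroring the "twice" above;
real hardware may need several register writes for that step.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NUM_COUNTERS 4 /* arbitrary size for the sketch */

/* Hypothetical PMU model; none of these names come from the patch set. */
struct toy_pmu {
	uint32_t shadow[NUM_COUNTERS]; /* desired event per counter, 0 = unused */
	uint32_t hw[NUM_COUNTERS];     /* what the modelled hardware holds */
	bool     dirty[NUM_COUNTERS];  /* shadow changed since last write-out */
	int      disable_depth;        /* nesting of toy_pmu_disable() calls */
	unsigned hw_accesses;          /* number of modelled hardware hits */
};

static void toy_pmu_disable(struct toy_pmu *p)
{
	if (p->disable_depth++ == 0)
		p->hw_accesses++; /* one hit to stop the PMU */
}

/* add/del only touch the shadow state; the hardware is left alone. */
static void toy_pmu_add(struct toy_pmu *p, int idx, uint32_t event)
{
	assert(idx >= 0 && idx < NUM_COUNTERS && p->disable_depth > 0);
	p->shadow[idx] = event;
	p->dirty[idx] = true;
}

static void toy_pmu_del(struct toy_pmu *p, int idx)
{
	assert(idx >= 0 && idx < NUM_COUNTERS && p->disable_depth > 0);
	p->shadow[idx] = 0;
	p->dirty[idx] = true;
}

static void toy_pmu_enable(struct toy_pmu *p)
{
	assert(p->disable_depth > 0);
	if (--p->disable_depth)
		return; /* still nested inside an outer disable */

	/* Write out everything that changed while the PMU was disabled. */
	for (int i = 0; i < NUM_COUNTERS; i++) {
		if (p->dirty[i]) {
			p->hw[i] = p->shadow[i];
			p->dirty[i] = false;
		}
	}
	p->hw_accesses++; /* modelled as one batched hit that restarts the PMU */
}
```

Any number of toy_pmu_del()/toy_pmu_add() calls between one
toy_pmu_disable() and the matching toy_pmu_enable() leaves hw_accesses
at two in this model.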
For this to work we need to have an association between a context and a
pmu, otherwise it's very hard to know which pmu to disable/enable; the
alternative is all of them, which isn't very attractive.
Then again, it doesn't make sense to have task-counters on an I2C
attached PMU simply because writing to the PMU could cause context
Thanks for your reply.
In our case, writing to some of the nest PMUs' control registers is done
via an internal bridge. We write to a specific memory address and an
internal MMIO-to-SCOM bridge (SCOM is similar to I2C) translates it to
serial and sends it over the internal serial bus. The address we write
to is based upon the control register's serial bus address, plus an
offset from the base of MMIO-to-SCOM bridge. The same process works for
While it does not cause a context switch because there are no IO drivers
to go through, it will take several thousand CPU cycles to complete,
which, by the same token, still makes these PMUs inappropriate for
task-counters (not to mention that the nest units operate asynchronously
from the CPU).
However, there are still concerns about writing these control
registers from an interrupt handler because of the latency that will be
incurred, no matter how slowly we choose to rotate the events. So at
least for the Wire-Speed processor, we may need a worker thread of some
sort to hand the work off to.
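
A rough sketch of that hand-off, as a single-threaded userspace model:
the interrupt handler only records the register write it wants, and a
worker performs the slow bridge write later. Everything here is
hypothetical (struct defer_queue, irq_queue_write(), worker_drain(),
the queue length), and it deliberately leaves out locking and real
threads. In the kernel, the workqueue API would be one likely way to do
this, but that is an assumption, not something our code does today.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define QUEUE_LEN 8 /* arbitrary capacity for the sketch */

struct reg_write {
	uint32_t reg; /* control register identifier (made up) */
	uint64_t val; /* value to write */
};

struct defer_queue {
	struct reg_write slot[QUEUE_LEN];
	size_t head;          /* next slot the worker reads */
	size_t tail;          /* next slot the handler fills */
	unsigned slow_writes; /* slow bridge writes performed so far */
};

/* Modelled interrupt handler path: record the request and return. */
static bool irq_queue_write(struct defer_queue *q, uint32_t reg, uint64_t val)
{
	if (q->tail - q->head == QUEUE_LEN)
		return false; /* queue full, caller must cope */
	q->slot[q->tail % QUEUE_LEN] = (struct reg_write){ reg, val };
	q->tail++;
	return true;
}

/* Stand-in for the write through the bridge that costs thousands of cycles. */
static void slow_bridge_write(struct defer_queue *q, const struct reg_write *w)
{
	(void)w; /* a real version would issue the MMIO store here */
	q->slow_writes++;
}

/* Modelled worker path: drain everything queued so far, in order. */
static size_t worker_drain(struct defer_queue *q)
{
	size_t done = 0;

	assert(q->tail - q->head <= QUEUE_LEN);
	while (q->head != q->tail) {
		slow_bridge_write(q, &q->slot[q->head % QUEUE_LEN]);
		q->head++;
		done++;
	}
	return done;
}
```

The point of the split is that the latency moves out of the handler:
irq_queue_write() does a bounded amount of work, and all the slow
writes happen in worker_drain().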
Our current code, based on Linux 2.6.31 (soon to be 2.6.32), doesn't use
worker threads; we are just taking the latency hit for now.