Re: [perfmon] Re: [PATCH 9/16] 2.6.17-rc6 perfmon2 patch for review: kernel-level API support (kapi)
From: Stephane Eranian
Date: Thu Jun 22 2006 - 08:19:22 EST
Christoph,
On Fri, Jun 16, 2006 at 04:45:19PM +0100, Christoph Hellwig wrote:
> On Fri, Jun 16, 2006 at 11:41:32AM -0400, Frank Ch. Eigler wrote:
> > Whether one uses systemtap, raw kprobes, or some specialized
> > tracing/stats-collecting patch surely forthcoming, kernel-level APIs
> > would be needed to perform fine-grained kernel-scope measurements
> > using these counters.
>
You do not need to be in the kernel to measure kernel level
execution. Monitoring is statistical by nature, this is not about capturing
execution traces. All PMU models have the capability to filter on privilege
levels so you can distinguish user from kernel.
To measure certain functions of the kernel, some PMU models provide a
way to restrict monitoring to a range of contiguous code addresses, e.g.
Itanium 2.
The case of systemtap is different. I think they would like to start/stop
monitoring on certain systemtap events, e.g., a function is called, a
threshold is met. Start and stop would be triggered from a systemtap
callback which is implemented by a kernel module, if I understand
the architecture. In the scenario, the monitoring session would have
to be created and controlled from the kernel. One could envision an
architecture, where monitoring would be controlled from user level
with systemtap making upcalls but I do not think this is possible given
that the instrumentation points can be very low level.
Another usage for a kernel-level monitoring API that I know about is
people who want to explore how to use the performance monitoring
(and profiles) to guide the scheduler. A thread profile can tell the cache
hit rates, stalls, bus bandwidth utilization, whether it uses flops and so on.
This could be useful to to find the best placement for threads and avoid co-scheduling
threads that trash each other's micro-architectural state or saturate the memory bus.
In this scenario, one could envision a kernel thread controlling monitoring
and processing profiles for the scheduler. But, to concur with you Christoph,
I think this could be achieved from user level and the valuable information
may be passed to the scheduler via a specific system call for instance.
> No, there's not need to add kernel bloat for performance monitoring.
> This kind of stuff shoul dabsolutely be done from userspace.
--
-Stephane
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/