Re: perf_counters issue with PERF_SAMPLE_GROUP

From: stephane eranian
Date: Thu Aug 13 2009 - 15:15:41 EST


On Thu, Aug 13, 2009 at 11:46 AM, Ingo Molnar<mingo@xxxxxxx> wrote:
>
> * stephane eranian <eranian@xxxxxxxxxxxxxx> wrote:
>
>> On Wed, Aug 12, 2009 at 11:02 AM, Ingo Molnar<mingo@xxxxxxx> wrote:
>>
>> > Not sure we want to change it. Mixing PID and CPU into the same
>> > space disallows the simultaneous application of both. I.e. right
>> > now we allow 3 models:
>> >
>> >  - PID-ish
>> >  - CPU-ish
>> >  - PID and CPU [say measure CPU#2 component of an inherited workload.]
>>
>> How useful is that last model, especially why only one CPU?
>
> It's somewhat useful: say on an inherited workload one could 'cut
> out' just a single CPU worth of samples.
>
> Or a tool could implement a more scalable sampling model: say on a
> quad core CPU one could have four counters in an inherited workload:
>
>  cycles:cpu0
>  cycles:cpu1
>  cycles:cpu2
>  cycles:cpu3
>

What would you be able to do with that kind of information?
How would you modify the program to improve performance?
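
For readers following the thread, the three (pid, cpu) models quoted above and
the four per-CPU counters can be sketched roughly as below. This is only an
illustration, not kernel code: classify_target() and per_cpu_targets() are
hypothetical helper names, and the convention that -1 means "any task" or
"any CPU" is an assumption made for this sketch.

```python
from typing import List, Tuple

ANY = -1  # assumed wildcard value for "any task" / "any CPU"


def classify_target(pid: int, cpu: int) -> str:
    """Name the targeting model implied by a (pid, cpu) pair."""
    if pid == ANY and cpu == ANY:
        # Neither a task nor a CPU is named, so there is nothing to measure.
        raise ValueError("pid and cpu cannot both be wildcards")
    if cpu == ANY:
        return "PID-ish"      # follow one task wherever it runs
    if pid == ANY:
        return "CPU-ish"      # everything that runs on one CPU
    return "PID and CPU"      # one task, counted only while on one CPU


def per_cpu_targets(pid: int, ncpus: int) -> List[Tuple[int, int]]:
    """Build one (pid, cpu) target per CPU, as in cycles:cpu0..cpu3."""
    return [(pid, cpu) for cpu in range(ncpus)]


if __name__ == "__main__":
    # Hypothetical pid 1234 on a quad-core machine.
    for pid, cpu in per_cpu_targets(1234, 4):
        print(f"cycles:cpu{cpu} -> {classify_target(pid, cpu)}")
```

Each of the four targets falls into the third model, which is why the
question of what the per-CPU breakdown buys you applies to all of them.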

> ... and depending on which CPU a sub-process or sub-thread is
> running on, would the according (nicely per cpu local) sampling
> buffer be used.
>
>> > Also, i dont really see the use-cases for new targets. (i've
>> > seen a few mentioned but none seemed valid) What new targets do
>> > people have in mind?
>>
>> I seem to recall people mentioned:
>>    1- CPU socket, e.g., uncore PMU
>>    2- chipset
>>    3- GPU
>>
>> I can see 1/ being indirectly achievable by specifying a CPU.
>
> Correct.
>
> ( Note, it's not just indirectly achievable as a side-effect - for
>   example the Intel uncore PMU has a target CPU irq-mask, so it
>   makes sense to allow the specification of the specific CPU we are
>   measuring on as well. The physical socket maps from the CPU. )
>
I know that because I used it to support uncore on perfmon.
Some people have argued, though, that it could be interesting to
interrupt all cores at once on uncore overflow. That's one way to know
where all cores are at the same point in time. I wonder how you could
support that.


>> [...] But the others are uncorrelated to either a CPU or thread. I
>> have already seen requests for accessing chipsets, and seems GPU
>> are around the corner now.
>>
>> Why do you think those would be invalid targets given the goal of
>> this API?
>
> No.
>
> Chipset and GPU measurements are very much possible via perfcounters
> as well - but that does not require the touching of the pid,cpu
> target parameters to sys_perf_counter_open().
>
> I think the confusion in this discussion comes from the fact that
> there are two different types of 'targets':
>
> The first type of target, the <pid,cpu> target is a _scheduling_,
> task management abstraction. Adding a chipset ID or GPU ID to that
> makes little sense! Tasks dont get scheduled on a 'chipset' - to
> each task the chipset looks like an external entity.
>
> The second type of target is the 'event source itself'. (and it's
> not really a target but a source.)
>
> A chipset or GPU should be abstracted via an _event source_
> abstraction. We've got wide possibilities to do that, and we already
> abstract a fair amount of non-CPU-sourced events that way: say we
> have irq tracepoint counters:
>
>  aldebaran:~> perf list 2>&1 | grep irq
>    irq:irq_handler_entry            [Tracepoint event]
>
> irqs come from the chipset, so in an (unintended) way perfcounters
> already instruments the chipset today.
>
> So yes, both chipset and GPU sampling is very much possible, and it
> does not require the tweaking of the syscall target parameters -
> each CPU has a typically symmetric view on it.
>

Except there can be many GPUs, I/O devices and other pieces of
hardware with PMU-like capabilities in a single system. In that case,
you need to be able to name them: I want to measure GPUcycles on
GPU0. When you are down at that level, you don't really care about
the CPU or thread. So what would you pass for those in that case?

> Note that there's overlap: a CPU can be an event source and a
> scheduling target as well. I think some of the confusion in
> terminology comes from that.
>
> To support chipset or GPU sampling, the perf_type_id and/or the
> struct perf_counter_attr space can be extended.
>
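
To make the naming question concrete, here is a rough sketch of what
"extending the type space" could look like if each device instance (GPU0,
GPU1, ...) received its own event-source type. Everything here is
hypothetical: SourceRegistry, CounterRequest, and the numeric type values are
illustrative only and do not describe an existing kernel interface. The
sketch also does not settle what pid and cpu should be for such a source; the
values used in the example are placeholders.

```python
from dataclasses import dataclass
from typing import Dict

# Illustrative stand-ins for the existing event-source types.
BUILTIN_TYPES: Dict[str, int] = {
    "hardware": 0,
    "software": 1,
    "tracepoint": 2,
    "hw_cache": 3,
    "raw": 4,
}


class SourceRegistry:
    """Hand out a distinct type id per named event source (hypothetical)."""

    def __init__(self) -> None:
        self._types = dict(BUILTIN_TYPES)
        self._next_id = max(BUILTIN_TYPES.values()) + 1

    def register(self, name: str) -> int:
        # Registering the same name twice returns the same id.
        if name not in self._types:
            self._types[name] = self._next_id
            self._next_id += 1
        return self._types[name]

    def lookup(self, name: str) -> int:
        return self._types[name]


@dataclass(frozen=True)
class CounterRequest:
    """Keeps the event source separate from the scheduling target."""
    source_type: int  # which source: CPU PMU, tracepoint, GPU0, ...
    config: int       # event selector within that source
    pid: int          # scheduling target: task
    cpu: int          # scheduling target: CPU


if __name__ == "__main__":
    registry = SourceRegistry()
    gpu0 = registry.register("gpu0")
    gpu1 = registry.register("gpu1")
    # Placeholder pid/cpu values; the thread leaves this choice open.
    print(CounterRequest(source_type=gpu0, config=0, pid=-1, cpu=0))
    print(CounterRequest(source_type=gpu1, config=0, pid=-1, cpu=0))
```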