Re: perf stat output

From: Peter Zijlstra
Date: Fri Nov 13 2009 - 03:07:50 EST


On Thu, 2009-11-12 at 20:03 -0200, Lucas De Marchi wrote:
> Hi all!
>
> Some questions about perf stat output. See example:
>
>
> lucas@LMS-linux:~/programming/testprograms> perf stat -e L1-dcache-loads -e L1-dcache-load-misses -- make -j
> gcc test_schedchanges.c -o test_schedchanges
> gcc -pthread test_taskaff1.c -o test_taskaff1
> gcc -pthread test_taskaff2.c -o test_taskaff2
> gcc -pthread test_taskaff3.c -o test_taskaff3
>
> Performance counter stats for 'make -j':
>
>   161384667  L1-dcache-loads          #      0.000 M/sec
>    24853791  L1-dcache-load-misses    #      0.000 M/sec
>
> 0.066893389 seconds time elapsed
>
> Why do we have both L1-dcache-loads and L1-dcache-load-misses with
> 0.000 M/sec? Also, why do we have 0 M/s when running "perf stat -a -e
> cache-misses -e cache-references" but values different than 0 when
> running "perf stat -a" without selecting the events?

No idea, you'd have to look at the code computing this M/sec stuff. I
think Ingo wrote that, so he might have an idea.

> The last question: what does the "scaled from X%" mean? Is it related
> to the maximum number of performance registers a processor can count
> at a time?

Yes, if the hardware has only 2 counters and you specify 4, we'll
round-robin those 4 onto the 2. In that case you'll see things like
scaled from ~50% because each counter will only have been on the actual
PMU for about 50% of the time.

(The round-robin rotation happens at tick granularity, so if your runtime
is of that order or shorter, you can get odd results, with some counters
reading 0.)
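
To make the scaling arithmetic concrete, here is a minimal sketch. It
assumes the usual extrapolation, where the raw count is multiplied by
the ratio of the time the event was enabled to the time it was actually
on the PMU. The function names and the nanosecond units are made up for
illustration; this is not the code perf itself runs.

```python
def scale_count(raw_count, time_enabled_ns, time_running_ns):
    """Extrapolate a multiplexed counter to the full enabled period.

    Hypothetical helper: assumes the event rate while the counter was
    off the PMU matched the rate while it was on.
    """
    if time_running_ns == 0:
        # The counter never got a turn on the PMU (e.g. a very short
        # run), so there is nothing to extrapolate from.
        return None
    return round(raw_count * time_enabled_ns / time_running_ns)


def scaled_from_percent(time_enabled_ns, time_running_ns):
    """Share of the enabled time the counter was really counting."""
    if time_enabled_ns == 0:
        return 0.0
    return 100.0 * time_running_ns / time_enabled_ns


# 4 events sharing 2 hardware counters: each is on the PMU about half
# of the time, so the raw count gets doubled.
estimate = scale_count(1_000_000, 10_000_000, 5_000_000)
percent = scaled_from_percent(10_000_000, 5_000_000)
print(estimate, percent)
```

With a counter that was on the PMU for half of the enabled time, the
estimate is twice the raw count and the reported share is 50%. A
counter that never got scheduled yields no estimate at all, which is
the "counters being 0" case for very short runs.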
