Re: [PATCH] perf evsel: Correct clock unit to nanosecond

From: Jiri Olsa
Date: Fri Nov 30 2018 - 05:22:00 EST


On Fri, Nov 30, 2018 at 06:06:05PM +0800, Leo Yan wrote:
> Since commit 0aa802a79469 ("perf stat: Get rid of extra clock display
> function"), the cpu and task clock unit has been changed from
> nanosecond value to millisecond value. This introduces confusion for
> CPU run time statistics, we can see in below flow the clock value is
> scaled from nanosecond value to millisecond value; but this is
> contradiction with statistics type 'STAT_NSECS', which always takes
> clock as nanosecond unit.
>
> perf_stat__update_shadow_stats()
> `-> count *= counter->scale;
> update_runtime_stat(st, STAT_NSECS, 0, cpu, count);
>
> And when 'perf stat' calculates frequencies for cpu cycles and other
> events, it uses event count to divide clock and it expects divisor unit
> has nanosecond value, but actually the clock value has been scaled to
> millisecond value. At the end we can exaggerate values in below output
> result (e.g. 500M/sec for context-switches, cycle is 1079636.500 GHz).
>
> # perf stat -- sleep 1
>
> Performance counter stats for 'sleep 1':
>
> 2.29 msec task-clock # 0.002 CPUs utilized
> 1 context-switches # 500.000 M/sec
> 1 cpu-migrations # 500.000 M/sec
> 53 page-faults # 26500.000 M/sec
> 2,159,273 cycles # 1079636.500 GHz
> 1,281,566 instructions # 0.59 insn per cycle
> 171,153 branches # 85576500.000 M/sec
> 17,870 branch-misses # 10.44% of all branches
>
> To fix this issue, this patch changes back clock unit to nanosecond
> level, as result we can see the correct frequency calculation:
>
> # perf stat -- sleep 1
>
> Performance counter stats for 'sleep 1':
>
> 2,461,180 nsec task-clock # 0.002 CPUs utilized
> 1 context-switches # 0.406 K/sec
> 0 cpu-migrations # 0.000 K/sec
> 54 page-faults # 0.022 M/sec
> 2,320,634 cycles # 0.943 GHz
> 1,284,442 instructions # 0.55 insn per cycle
> 171,425 branches # 69.652 M/sec
> 17,970 branch-misses # 10.48% of all branches
>
> This patch has side effect for changing unit field for perf command.
> This is tested with below two cases: one is for normal output format and
> another is for cvs output format.
>
> Before:
>
> # perf stat -e cpu-clock,task-clock -C 0 sleep 3
>
> Performance counter stats for 'CPU(s) 0':
>
> 3,003.50 msec cpu-clock # 1.000 CPUs utilized
> 3,003.50 msec task-clock # 1.000 CPUs utilized
>
> 3.003697880 seconds time elapsed
>
> Now:
>
> # perf stat -e cpu-clock,task-clock -C 0 sleep 3
>
> Performance counter stats for 'CPU(s) 0':
>
> 3,003,684,880 nsec cpu-clock # 1.000 CPUs utilized
> 3,003,682,780 nsec task-clock # 1.000 CPUs utilized
>
> 3.003881982 seconds time elapsed
>
> Before:
>
> # perf stat -e cpu-clock,task-clock -C 0 -x, sleep 3
> 3003.64,msec,cpu-clock,3003644320,100.00,1.000,CPUs utilized
> 3003.64,msec,task-clock,3003643300,100.00,1.000,CPUs utilized
>
> Now:
>
> # perf stat -e cpu-clock,task-clock -C 0 -x, sleep 3
> 3003501460,nsec,cpu-clock,3003501660,100.00,1.000,CPUs utilized
> 3003499500,nsec,task-clock,3003499500,100.00,1.000,CPUs utilized
>
> Cc: Daniel Thompson <daniel.thompson@xxxxxxxxxx>
> Signed-off-by: Leo Yan <leo.yan@xxxxxxxxxx>

there was fix for this recently, could you please check
if it's working for you:
6e269c85dcea perf stat: Fix shadow stats for clock events

thanks,
jirka