[PATCH] perf evsel: Correct clock unit to nanosecond

From: Leo Yan
Date: Fri Nov 30 2018 - 05:06:24 EST


Since commit 0aa802a79469 ("perf stat: Get rid of extra clock display
function"), the cpu and task clock unit has been changed from
nanosecond value to millisecond value. This introduces confusion for
CPU run time statistics, we can see in below flow the clock value is
scaled from nanosecond value to millisecond value; but this is
contradiction with statistics type 'STAT_NSECS', which always takes
clock as nanosecond unit.

perf_stat__update_shadow_stats()
`-> count *= counter->scale;
update_runtime_stat(st, STAT_NSECS, 0, cpu, count);

And when 'perf stat' calculates frequencies for cpu cycles and other
events, it uses event count to divide clock and it expects divisor unit
has nanosecond value, but actually the clock value has been scaled to
millisecond value. At the end we can exaggerate values in below output
result (e.g. 500M/sec for context-switches, cycle is 1079636.500 GHz).

# perf stat -- sleep 1

Performance counter stats for 'sleep 1':

2.29 msec task-clock # 0.002 CPUs utilized
1 context-switches # 500.000 M/sec
1 cpu-migrations # 500.000 M/sec
53 page-faults # 26500.000 M/sec
2,159,273 cycles # 1079636.500 GHz
1,281,566 instructions # 0.59 insn per cycle
171,153 branches # 85576500.000 M/sec
17,870 branch-misses # 10.44% of all branches

To fix this issue, this patch changes back clock unit to nanosecond
level, as result we can see the correct frequency calculation:

# perf stat -- sleep 1

Performance counter stats for 'sleep 1':

2,461,180 nsec task-clock # 0.002 CPUs utilized
1 context-switches # 0.406 K/sec
0 cpu-migrations # 0.000 K/sec
54 page-faults # 0.022 M/sec
2,320,634 cycles # 0.943 GHz
1,284,442 instructions # 0.55 insn per cycle
171,425 branches # 69.652 M/sec
17,970 branch-misses # 10.48% of all branches

This patch has side effect for changing unit field for perf command.
This is tested with below two cases: one is for normal output format and
another is for cvs output format.

Before:

# perf stat -e cpu-clock,task-clock -C 0 sleep 3

Performance counter stats for 'CPU(s) 0':

3,003.50 msec cpu-clock # 1.000 CPUs utilized
3,003.50 msec task-clock # 1.000 CPUs utilized

3.003697880 seconds time elapsed

Now:

# perf stat -e cpu-clock,task-clock -C 0 sleep 3

Performance counter stats for 'CPU(s) 0':

3,003,684,880 nsec cpu-clock # 1.000 CPUs utilized
3,003,682,780 nsec task-clock # 1.000 CPUs utilized

3.003881982 seconds time elapsed

Before:

# perf stat -e cpu-clock,task-clock -C 0 -x, sleep 3
3003.64,msec,cpu-clock,3003644320,100.00,1.000,CPUs utilized
3003.64,msec,task-clock,3003643300,100.00,1.000,CPUs utilized

Now:

# perf stat -e cpu-clock,task-clock -C 0 -x, sleep 3
3003501460,nsec,cpu-clock,3003501660,100.00,1.000,CPUs utilized
3003499500,nsec,task-clock,3003499500,100.00,1.000,CPUs utilized

Cc: Daniel Thompson <daniel.thompson@xxxxxxxxxx>
Signed-off-by: Leo Yan <leo.yan@xxxxxxxxxx>
---
tools/perf/util/evsel.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index dbc0466..90a5381 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -267,10 +267,10 @@ struct perf_evsel *perf_evsel__new_idx(struct perf_event_attr *attr, int idx)
* The evsel->unit points to static alias->unit
* so it's ok to use static string in here.
*/
- static const char *unit = "msec";
+ static const char *unit = "nsec";

evsel->unit = unit;
- evsel->scale = 1e-6;
+ evsel->scale = 1;
}

return evsel;
--
2.7.4