[tip:perf/urgent] perf sched timehist: Fix invalid period calculation

From: tip-bot for Namhyung Kim
Date: Fri Dec 23 2016 - 14:29:54 EST


Commit-ID: bdd75729e5d279d734e8d3fb41ef4818ac1598ab
Gitweb: http://git.kernel.org/tip/bdd75729e5d279d734e8d3fb41ef4818ac1598ab
Author: Namhyung Kim <namhyung@xxxxxxxxxx>
AuthorDate: Thu, 22 Dec 2016 15:03:49 +0900
Committer: Arnaldo Carvalho de Melo <acme@xxxxxxxxxx>
CommitDate: Thu, 22 Dec 2016 16:35:46 -0300

perf sched timehist: Fix invalid period calculation

When --time option is given with a value outside recorded time, the last
sample time (tprev) was set to that value and run time calculation might
be incorrect. This is a problem of the first samples for each cpus
since it would skip the runtime update when tprev is 0. But with --time
option it had non-zero (which is invalid) value so the calculation is
also incorrect.

For example, let's see the followging:

$ perf sched timehist
time cpu task name wait time sch delay run time
[tid/pid] (msec) (msec) (msec)
--------------- ------ ------------------------------ --------- --------- ---------
3195.968367 [0003] <idle> 0.000 0.000 0.000
3195.968386 [0002] Timer[4306/4277] 0.000 0.000 0.018
3195.968397 [0002] Web Content[4277] 0.000 0.000 0.000
3195.968595 [0001] JS Helper[4302/4277] 0.000 0.000 0.000
3195.969217 [0000] <idle> 0.000 0.000 0.621
3195.969251 [0001] kworker/1:1H[291] 0.000 0.000 0.033

The sample starts at 3195.968367 but when I gave a time interval from
3194 to 3196 (in sec) it will calculate the whole 2 second as runtime.
In below, 2 cpus accounted it as runtime, other 2 cpus accounted it as
idle time.

Before:

$ perf sched timehist --time 3194,3196 -s | tail
Idle stats:
CPU 0 idle for 1995.991 msec
CPU 1 idle for 20.793 msec
CPU 2 idle for 30.191 msec
CPU 3 idle for 1999.852 msec

Total number of unique tasks: 23
Total number of context switches: 128
Total run time (msec): 3724.940

After:

$ perf sched timehist --time 3194,3196 -s | tail
Idle stats:
CPU 0 idle for 10.811 msec
CPU 1 idle for 20.793 msec
CPU 2 idle for 30.191 msec
CPU 3 idle for 18.337 msec

Total number of unique tasks: 23
Total number of context switches: 128
Total run time (msec): 18.139

Committer notes:

Further testing:

Before:

Idle stats:
CPU 0 idle for 229.785 msec
CPU 1 idle for 937.944 msec
CPU 2 idle for 188.931 msec
CPU 3 idle for 986.185 msec

After:

# perf sched timehist --time 40602,40603 -s | tail

Idle stats:
CPU 0 idle for 229.785 msec
CPU 1 idle for 175.407 msec
CPU 2 idle for 188.931 msec
CPU 3 idle for 223.657 msec

Total number of unique tasks: 68
Total number of context switches: 814
Total run time (msec): 97.688

# for cpu in `seq 0 3` ; do echo -n "CPU $cpu idle for " ; perf sched timehist --time 40602,40603 | grep "\[000${cpu}\].*\<idle\>" | tr -s ' ' | cut -d' ' -f7 | awk '{entries++ ; s+=$1} END {print s " msec (entries: " entries ")"}' ; done
CPU 0 idle for 229.721 msec (entries: 123)
CPU 1 idle for 175.381 msec (entries: 65)
CPU 2 idle for 188.903 msec (entries: 56)
CPU 3 idle for 223.61 msec (entries: 102)

Difference due to the idle stats being accounted at nanoseconds precision while
the <idle> entries in 'perf sched timehist' are trucated at msec.usec.

Signed-off-by: Namhyung Kim <namhyung@xxxxxxxxxx>
Tested-by: Arnaldo Carvalho de Melo <acme@xxxxxxxxxx>
Cc: David Ahern <dsahern@xxxxxxxxx>
Cc: Jiri Olsa <jolsa@xxxxxxxxxx>
Cc: Peter Zijlstra <a.p.zijlstra@xxxxxxxxx>
Fixes: 853b74071110 ("perf sched timehist: Add option to specify time window of interest")
Link: http://lkml.kernel.org/r/20161222060350.17655-2-namhyung@xxxxxxxxxx
Signed-off-by: Arnaldo Carvalho de Melo <acme@xxxxxxxxxx>
---
tools/perf/builtin-sched.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
index 5052caa..d53e706 100644
--- a/tools/perf/builtin-sched.c
+++ b/tools/perf/builtin-sched.c
@@ -2405,7 +2405,7 @@ static int timehist_sched_change_event(struct perf_tool *tool,
if (ptime->start && ptime->start > t)
goto out;

- if (ptime->start > tprev)
+ if (tprev && ptime->start > tprev)
tprev = ptime->start;

/*