[PATCHSET 0/6] perf sched timehist: Introduce --idle-hist option (v2)

From: Namhyung Kim
Date: Thu Dec 08 2016 - 09:49:13 EST


Hi,

This patchset implements the idle hist feature which analyze reason of system
idle. Sometimes I need to investigate what makes CPUs to go idle even though
I have jobs to do. It may be due to I/O, waiting on lock or whatever.

To identify the reasons it only accounts events related to idle task. Also it
shows callchains when entering to idle and time how long it's in the idle.
Although it's not perfect, it works well to help finding the reasons.

With --idle-hist option, only idle task's runtime is accounted to previous
task (which makes the cpu go to idle). With --summary(-only) option, you can
see the total idle stat by task or callchains.

* v2 changes
- check prev_pid instead of sample->pid for idle task (due to race in exit path)
- factor idle task initialization (David)

The example output looks like below:

Idle-time summary
comm parent sched-out idle-time min-idle avg-idle max-idle stddev migrations
(count) (msec) (msec) (msec) (msec) %
---------------------------------------------------------------------------------------------------------------------
rcu_preempt[7] 2 95 550.872 0.011 5.798 23.146 7.63 0
migration/1[16] 2 1 15.558 15.558 15.558 15.558 0.00 0
khugepaged[39] 2 1 3.062 3.062 3.062 3.062 0.00 0
kworker/0:1H[124] 2 2 4.728 0.611 2.364 4.116 74.12 0
systemd-journal[167] 1 1 4.510 4.510 4.510 4.510 0.00 0
kworker/u16:3[558] 2 13 74.737 0.080 5.749 12.960 21.96 0
irq/34-iwlwifi[628] 2 21 118.403 0.032 5.638 23.990 24.00 0
kworker/u17:0[673] 2 1 3.523 3.523 3.523 3.523 0.00 0
dbus-daemon[722] 1 1 6.743 6.743 6.743 6.743 0.00 0
ifplugd[741] 1 1 58.826 58.826 58.826 58.826 0.00 0
wpa_supplicant[1490] 1 1 13.302 13.302 13.302 13.302 0.00 0
wpa_actiond[1492] 1 2 4.064 0.168 2.032 3.896 91.72 0
dockerd[1500] 1 1 0.055 0.055 0.055 0.055 0.00 0
dockerd[1516/1500] 1500 1 5.732 5.732 5.732 5.732 0.00 0
dockerd[1571/1500] 1500 1 0.020 0.020 0.020 0.020 0.00 0
...

Idle stats:
CPU 0 idle for 902.195 msec
CPU 1 idle for 899.932 msec
CPU 2 idle for 941.218 msec
CPU 3 idle for 822.453 msec

Idle stats by callchain:
CPU 0: 902.195 msec
Idle time (msec) Count Callchains
---------------- -------- --------------------------------------------------
370.589 69 futex_wait_queue_me <- futex_wait <- do_futex <- sys_futex <- entry_SYSCALL_64_fastpath
178.799 17 worker_thread <- kthread <- ret_from_fork
128.352 17 schedule_timeout <- rcu_gp_kthread <- kthread <- ret_from_fork
125.111 19 schedule_hrtimeout_range_clock <- schedule_hrtimeout_range <- poll_schedule_timeout <- do_select <- core_sys_select
71.599 50 schedule_hrtimeout_range_clock <- schedule_hrtimeout_range <- poll_schedule_timeout <- do_sys_poll <- sys_poll
23.146 1 rcu_gp_kthread <- kthread <- ret_from_fork
4.510 1 schedule_hrtimeout_range_clock <- schedule_hrtimeout_range <- ep_poll <- sys_epoll_wait <- do_syscall_64
0.085 1 schedule_hrtimeout_range_clock <- schedule_hrtimeout_range <- poll_schedule_timeout <- do_sys_poll <- do_restart_poll
...


The code is available at 'perf/timehist-idle-v2' branch in my tree:

git://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git

Any comments are welcome!

Thanks,
Namhyung


Namhyung Kim (6):
perf sched timehist: Split is_idle_sample()
perf sched timehist: Introduce struct idle_time_data
perf sched timehist: Save callchain when entering idle
perf sched timehist: Skip non-idle events when necessary
perf sched timehist: Add -I/--idle-hist option
perf sched timehist: Show callchains for idle stat

tools/perf/Documentation/perf-sched.txt | 4 +
tools/perf/builtin-sched.c | 252 +++++++++++++++++++++++++++-----
2 files changed, 222 insertions(+), 34 deletions(-)

--
2.10.1