Re: [PATCHSET 00/12] perf tools: Apply percent-limit to callchains

From: Namhyung Kim
Date: Tue Jan 26 2016 - 09:50:21 EST


On Tue, Jan 26, 2016 at 03:41:35PM +0100, Jiri Olsa wrote:
> On Tue, Jan 26, 2016 at 11:10:25PM +0900, Namhyung Kim wrote:
> > On Tue, Jan 26, 2016 at 02:27:26PM +0100, Jiri Olsa wrote:
> > > On Tue, Jan 26, 2016 at 09:51:59PM +0900, Namhyung Kim wrote:
> > > > On Tue, Jan 26, 2016 at 01:14:47PM +0100, Jiri Olsa wrote:
> > > > > On Sun, Jan 24, 2016 at 10:53:23PM +0900, Namhyung Kim wrote:
> > > > > > Hello,
> > > > > >
> > > > > > This patchset tries to implement a percent limit for callchains, which was
> > > > > > requested by Andi Kleen. For some reason, limiting callchains by
> > > > > > (overhead) percentage didn't work well. This patchset fixes that and makes
> > > > > > --percent-limit work for callchains as well as hist entries.
> > > > > >
> > > > > > This is available on 'perf/callchain-limit-v1' branch in my tree:
> > > > > >
> > > > > > git://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git
> > > > > >
> > > > > > Any comments are welcome,
> > > > > >
> > > > > > Thanks,
> > > > > > Namhyung
> > > > > >
> > > > > >
> > > > > > Namhyung Kim (12):
> > > > > > perf report: Apply --percent-limit to callchains also
> > > > > > perf report: Apply callchain percent limit on --stdio
> > > > > > perf report: Get rid of hist_entry__callchain_fprintf()
> > > > > > perf report: Fix percent calculation on --stdio
> > > > > > perf report: Hide output pipe for percent-limited callchains on stdio
> > > > > > perf hists browser: Fix dump to show correct callchain style
> > > > > > perf hists browser: Fix callchain_node__count_rows()
> > > > > > perf hists browser: Apply callchain percent limit
> > > > > > perf hists browser: Fix callchain counting when press ENTER key
> > > > > > perf hists browser: Fix counting callchains when expand/collapse all
> > > > > > perf hists browser: Update percent base for fractal callchain mode
> > > > > > perf report: Fix callchain percent limit on --gtk
> > > > >
> > > > > is 0.5 the default or one has to use the --percent-limit 0.5
> > > > > for the limit to be effective?
> > > >
> > > > Yes, it's effective now. I also think we need to change the default
> > > > limit of 0.5. It was set for 'fractal' mode initially AFAIK so its
> > > > percentage is relative to each node. In this case 0.5% of limit makes
> > > > sense because it'll be a very small (absolute) value.
> > > >
> > > > But with 'graph' mode (now the default), there are many entries under 0.5%
> > > > overhead, and they silently won't show callchains anymore. Actually I
> > > > was confused by it when working with this patchset.
> > > >
> > > > What about 0.005% for the new default?
> > > >
> > > >
> > > > >
> > > > > without the option I'm getting empty callchains that are below 0.5
> > > > > but only in TUI mode (attached).. --stdio shows them all unfolded
> > > >
> > > > It should not show them all. But I found that I missed a check for
> > > > a stdio case. Could you please test below?
> > >
> > > did not help, it's still there.. same output as before
> >
> > Hmm.. strange, could you show me the (part of) stdio output?
> >
>
> yea, that one changed as well.. no callchains now, attached
>
>
> jirka
>
>
> ---
> [jolsa@krava perf]$ ./perf report --stdio
>
> ...
>
> 46.69% 46.69% ls [kernel.vmlinux] [k] intel_bts_enable_local
> |
> ---0x1000
> __statfs
> entry_SYSCALL_64_fastpath
> sys_statfs
> SYSC_statfs
> user_statfs
> user_path_at_empty
> filename_lookup
> path_lookupat
> link_path_walk
> inode_permission
> __inode_permission
> kernfs_iop_permission
> kernfs_refresh_inode
> security_inode_notifysecctx
> selinux_inode_notifysecctx
> selinux_inode_setsecurity
> security_context_to_sid
> security_context_to_sid_core
> string_to_context_struct
> hashtab_search
> apic_timer_interrupt
> smp_apic_timer_interrupt
> local_apic_timer_interrupt
> hrtimer_interrupt
> __hrtimer_run_queues
> tick_sched_timer
> tick_sched_handle.isra.17
> update_process_times
> scheduler_tick
> perf_event_task_tick
> perf_pmu_enable.part.87
> x86_pmu_enable
> intel_pmu_enable_all
> intel_bts_enable_local
>
> 0.08% 0.00% perf [kernel.vmlinux] [k] perf_pmu_enable.part.87
> 0.08% 0.00% perf [kernel.vmlinux] [k] perf_event_context_sched_in
> 0.08% 0.00% perf [kernel.vmlinux] [k] perf_event_exec
> 0.08% 0.00% perf [kernel.vmlinux] [k] setup_new_exec
> 0.08% 0.00% perf [kernel.vmlinux] [k] load_elf_binary
> 0.08% 0.00% perf [kernel.vmlinux] [k] search_binary_handler
> 0.08% 0.00% perf [kernel.vmlinux] [k] do_execveat_common.isra.33
> 0.08% 0.00% perf [kernel.vmlinux] [k] sys_execve
> 0.08% 0.00% perf [kernel.vmlinux] [k] return_from_execve
> 0.08% 0.00% perf [unknown] [k] 0x00007f2175b35e07
> 0.04% 0.00% perf [kernel.vmlinux] [k] perf_event_nmi_handler
> 0.04% 0.00% perf [kernel.vmlinux] [k] nmi_handle
> 0.04% 0.00% perf [kernel.vmlinux] [k] default_do_nmi
> 0.04% 0.00% perf [kernel.vmlinux] [k] do_nmi
> 0.04% 0.00% perf [kernel.vmlinux] [k] end_repeat_nmi
> 0.04% 0.04% perf [kernel.vmlinux] [k] x86_pmu_enable
> 0.04% 0.04% perf [kernel.vmlinux] [k] native_apic_mem_write
>

What's the problem? Now by default callchains under 0.5% (absolute)
will not be shown. I think this is the intended output, and we need to
consider changing the default percent limit.
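
To make the difference between the two modes concrete, here is a rough
sketch of the check as described in this thread. It is only an
illustrative model in Python, not the actual perf code: the function
name, its parameters, and the sample counts are all made up for the
example, and the exact comparison used in perf may differ.

```python
def callchain_visible(node_hits, parent_hits, total_hits, limit_pct, mode):
    """Return True if a callchain node passes the percent limit.

    Hypothetical model of the behaviour discussed in this thread:
      - 'graph' mode:   the percentage is absolute (relative to all samples)
      - 'fractal' mode: the percentage is relative to the parent node
    """
    if mode == "graph":
        base = total_hits
    elif mode == "fractal":
        base = parent_hits
    else:
        raise ValueError("unknown callchain mode: %r" % mode)
    return 100.0 * node_hits / base >= limit_pct


# Assumed numbers: 10000 samples in total, one entry with 8 hits (0.08%)
# whose only callchain also has 8 hits.
TOTAL = 10000

# graph mode, limit 0.5: 0.08% is below the limit, so the chain is hidden
hidden_in_graph = not callchain_visible(8, 8, TOTAL, 0.5, "graph")

# fractal mode, limit 0.5: the chain is 100% of its parent, so it is shown
shown_in_fractal = callchain_visible(8, 8, TOTAL, 0.5, "fractal")

# graph mode with the proposed 0.005 limit: 0.08% passes again
shown_with_new_default = callchain_visible(8, 8, TOTAL, 0.005, "graph")
```

With these assumed numbers, the 0.08% entries from the output above lose
their callchains in 'graph' mode with a 0.5 limit, while the 46.69% entry
keeps its chain. Lowering the limit to 0.005 would bring the small ones
back.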

Thanks,
Namhyung