Re: [PATCH v6 00/10] tracing: 'hist' triggers

From: Tom Zanussi
Date: Thu Jun 04 2015 - 21:15:37 EST


On Fri, 2015-06-05 at 07:38 +0900, Masami Hiramatsu wrote:
> On 2015/06/04 3:30, Tom Zanussi wrote:
> > This is v6 of the 'hist triggers' patchset, following feedback from
> > v5.
> >
> > Changes from v5:
> >
> > This version adds support for compound keys, along with the related
> > ability to sort using primary and secondary keys. This was mentioned
> > in previous versions as the last important piece that remained
> > unimplemented, and is now implemented. (I didn't have time to get to
> > the couple of enhancements suggested by Masami, but I expect to be
> > able to add those later on top of these.)
>
> No problem, I agree that we can add such enhancement afterwards.
>
> > Because we now support compound keys and it's not immediately clear in
> > the output exactly which fields correspond to keys, the key(s),
> > compound or not, are now enclosed by curly braces.
>
> Yes, this looks very nice :)
> Here, I've tested with probes.
>
> ----
> [root@localhost tracing]# perf probe __kmalloc caller=\$stack0 size
> Added new event:
> probe:__kmalloc (on __kmalloc with caller=$stack0 size)
>
> You can now use it in all perf tools, such as:
>
> perf record -e probe:__kmalloc -aR sleep 1
>
> [root@localhost tracing]# echo 'hist:keys=caller.sym,size:vals=hitcount:sort=size' > events/probe/__kmalloc/trigger
> [root@localhost tracing]# cat events/probe/__kmalloc/hist
> # trigger info: hist:keys=caller.sym,size:vals=hitcount:sort=size:size=2048 [active]
>
> { caller: [ffffffffa020fdb7] drm_mode_dirtyfb_ioctl , size: 8 } hitcount: 12
> { caller: [ffffffff812966c0] load_elf_binary , size: 28 } hitcount: 1
> { caller: [ffffffff8125bde4] alloc_fdmem , size: 64 } hitcount: 1
> { caller: [ffffffffa02e0770] __nf_ct_ext_add_length , size: 96 } hitcount: 1
> { caller: [ffffffff81415eb8] bio_alloc_bioset , size: 152 } hitcount: 1
> { caller: [ffffffff81296456] load_elf_phdrs , size: 392 } hitcount: 1
> { caller: [ffffffff81296456] load_elf_phdrs , size: 504 } hitcount: 1
> { caller: [ffffffff81528f18] __tty_buffer_request_room , size: 1056 } hitcount: 1
> { caller: [ffffffff8169ca3b] sk_prot_alloc , size: 1136 } hitcount: 1
> { caller: [ffffffff8125bde4] alloc_fdmem , size: 2048 } hitcount: 1
> { caller: [ffffffff8126275b] seq_buf_alloc , size: 4096 } hitcount: 6
> { caller: [ffffffff81196540] tracing_map_sort_entries , size: 16384 } hitcount: 1
>
> Totals:
> Hits: 28
> Entries: 12
> Dropped: 0
> ----
> BTW, when I pass 'keys' as a sort key, it just ignored and no error is returned.
> I think it would better return an error if parsing is failed.
>

Yeah, I think that makes more sense (the current behavior is to just use
hitcount in that case). I'll change it to do that.

Thanks for testing,

Tom

> Thank you!
>
>
>
> >
> > The Documentation and README have been updated to reflect the changes,
> > and several new examples have been added to illustrate how to use
> > compound keys.
> >
> > Also, the code was updated to work with the new ftrace_event_file,
> > etc, renaming in tracing/for-next.
> >
> > Changes from v4:
> >
> > This version addresses some problems and suggestions made by Daniel
> > Wagner - a lot of the code was reworked to get rid of the distinction
> > between keys and values, and as a result, both keys and values can be
> > used as sort keys. As suggested, it also allows 'val=' to be absent
> > in a trigger command - if no 'val' is specified, hitcount is assumed
> > and automatically used as the only val.
> >
> > The map code was also separated out into a separate file,
> > tracing_map.c, allowing it to be reused. It also adds a second tracer
> > called function_hist that actually does reuse the code, as an RFC
> > patch.
> >
> > Patch 01/10 [tracing: Update cond flag when enabling or disabling..]
> > is a fix for a problem noticed by Daniel and that fixes a problem in
> > existing trigger code and should be applied regardless of whether the
> > rest of the patchset is merged.
> >
> > As mentioned, patch 10/10 is an RFC patch implementing a new tracer
> > based on the function tracer code. It's a fun little tool and is
> > useful for a specific problem I'm working on (and is also a nice test
> > of the tracing_map code), but is an RFC because first, I'm not sure it
> > would really be of general interest and secondly, it's POC-level
> > quality and I'd need to spend more time fixing it up to make it
> > upstreamable, but I don't want to waste my time if not.
> >
> > There are a couple of important bits of functionality that were
> > present in v1 but not yet reimplemented in v5.
> >
> > The first is support for compound keys. Currently, maps can only be
> > keyed on a single event field, whereas in v1 they could be keyed on
> > multiple keys. With support for compound keys, you can create much
> > more interesting output, such as for example per-pid lists of
> > syscalls or read counts e.g.:
> >
> > # echo 'hist:keys=common_pid.execname,id.syscall:vals=hitcount' > \
> > /sys/kernel/debug/tracing/events/raw_syscalls/sys_enter/trigger
> >
> > # cat /sys/kernel/debug/tracing/events/raw_syscalls/sys_enter/hist
> >
> > key: common_pid:bash[3112], id:sys_write vals: count:69
> > key: common_pid:bash[3112], id:sys_rt_sigprocmask vals: count:218
> >
> > key: common_pid:update-notifier[3164], id:sys_poll vals: count:37
> > key: common_pid:update-notifier[3164], id:sys_recvfrom vals: count:118
> >
> > key: common_pid:deja-dup-monito[3194], id:sys_sendto vals: count:1
> > key: common_pid:deja-dup-monito[3194], id:sys_read vals: count:4
> > key: common_pid:deja-dup-monito[3194], id:sys_poll vals: count:8
> > key: common_pid:deja-dup-monito[3194], id:sys_recvmsg vals: count:8
> > key: common_pid:deja-dup-monito[3194], id:sys_getegid vals: count:8
> >
> > key: common_pid:emacs[3275], id:sys_fsync vals: count:1
> > key: common_pid:emacs[3275], id:sys_open vals: count:1
> > key: common_pid:emacs[3275], id:sys_symlink vals: count:2
> > key: common_pid:emacs[3275], id:sys_poll vals: count:23
> > key: common_pid:emacs[3275], id:sys_select vals: count:23
> > key: common_pid:emacs[3275], id:unknown_syscall vals: count:34
> > key: common_pid:emacs[3275], id:sys_ioctl vals: count:60
> > key: common_pid:emacs[3275], id:sys_rt_sigprocmask vals: count:116
> >
> > key: common_pid:cat[3323], id:sys_munmap vals: count:1
> > key: common_pid:cat[3323], id:sys_fadvise64 vals: count:1
> >
> > Related to that is support for sorting on multiple fields. Currently,
> > you can sort using only a primary key. Being able to sort on multiple
> > or at least a secondary key is indispensible for seeing trends when
> > displaying multiple values.
> >
> > Changes from v3:
> >
> > v4 fixes the race in tracing_map_insert() noted in v3, where
> > map.val.key could be checked even if map.val wasn't yet set. The
> > simple fix for that in tracing_map_insert() introduces the possibility
> > of duplicates in the map, which though rare, need to be accounted for
> > in the output. To address that, duplicate-merging code was added to
> > the map-printing code.
> >
> > It was also pointed out that it didn't seem correct to include
> > module.h, but the fix for that has deeper roots and is being addressed
> > by a separate patchset; for now we need to continue including
> > module.h, though prompted by that I did some other header include
> > cleanup.
> >
> > The functionality remains the same as v2, but this version no longer
> > tries to export and use bpf_maps, and more importantly removes the
> > associated GFP_NOTRACE/trace event hacks and kmem macros required to
> > work around the bpf_map implementation.
> >
> > The tracing_map functionality is instead built on top of a simple
> > lock-free map algorithm originated by Dr. Cliff Click (see references
> > in the code for more details), which though too restrictive to be
> > general-purpose in its current form, functions nicely as a
> > special-purpose tracing map.
> >
> > v3 also moves the hist triggers code into a separate file and puts it
> > all behind a new config option, CONFIG_HIST_TRIGGERS. It also merges
> > in the sorting code rather than keeping it as a separate patch.
> >
> > This patchset also includes a couple other new and related triggers,
> > enable_hist and disable_hist, very similar to the existing
> > enable_event/disable_event triggers used to automatically enable and
> > disable events based on a triggering condition, but in this case
> > allowing hist triggers to be enabled and disabled in the same way.
> >
> > - Added an insert check for val before checking the key associated with val
> > - Added code to merge possible duplicates in the map
> >
> > Changes from v2:
> > - reimplemented tracing_map, replacing bpf_map with nmi-safe/lock-free map
> > - removed GPF_NOTRACE, kmalloc/free macros and event hacks needed by bpf_maps
> > - moved hist triggers from trace_events_trigger.c to trace_events_hist.c
> > - added CONFIG_HIST_TRIGGERS config option
> > - consolidated sorting code with main patch
> >
> > Changes from v1:
> > - completely rewritten on top of tracing_map (renamed and exported bpf_map)
> > - added map clearing and client ops to tracing_map
> > - changed the name from 'hash' triggers to 'hist' triggers
> > - added new trigger 'pause' feature
> > - added new enable_hist and disable_hist triggers
> > - added usage for hist/enable_hist/disable hist to tracing/README
> > - moved examples into Documentation/trace/event.txt
> > - added ___GFP_NOTRACE, kmalloc/kfree macros, and conditional kmem tracepoints
> >
> > The following changes since commit a497adb45b8691f7e477e711a1a4bd54748d64fe:
> >
> > ring-buffer: Add enum names for the context levels (2015-05-29 10:39:08 -0400)
> >
> > are available in the git repository at:
> >
> > git://git.yoctoproject.org/linux-yocto-contrib.git tzanussi/hist-triggers-v6
> > http://git.yoctoproject.org/cgit/cgit.cgi/linux-yocto-contrib/log/?h=tzanussi/hist-triggers-v6
> >
> > Tom Zanussi (10):
> > tracing: Update cond flag when enabling or disabling a trigger
> > tracing: Make ftrace_event_field checking functions available
> > tracing: Add event record param to trigger_ops.func()
> > tracing: Add get_syscall_name()
> > tracing: Add a per-event-trigger 'paused' field
> > trace: Add lock-free tracing_map
> > tracing: Add 'hist' event trigger command
> > tracing: Add enable_hist/disable_hist triggers
> > tracing: Add 'hist' trigger Documentation
> > ftrace: Add function_hist tracer
> >
> > Documentation/trace/events.txt | 1091 ++++++++++++++++++++++++++++
> > include/linux/trace_events.h | 9 +-
> > kernel/trace/Kconfig | 14 +
> > kernel/trace/Makefile | 3 +
> > kernel/trace/trace.c | 68 ++
> > kernel/trace/trace.h | 88 ++-
> > kernel/trace/trace_events.c | 4 +
> > kernel/trace/trace_events_filter.c | 12 -
> > kernel/trace/trace_events_hist.c | 1360 +++++++++++++++++++++++++++++++++++
> > kernel/trace/trace_events_trigger.c | 149 ++--
> > kernel/trace/trace_functions_hist.c | 345 +++++++++
> > kernel/trace/trace_syscalls.c | 11 +
> > kernel/trace/tracing_map.c | 920 ++++++++++++++++++++++++
> > kernel/trace/tracing_map.h | 143 ++++
> > 14 files changed, 4133 insertions(+), 84 deletions(-)
> > create mode 100644 kernel/trace/trace_events_hist.c
> > create mode 100644 kernel/trace/trace_functions_hist.c
> > create mode 100644 kernel/trace/tracing_map.c
> > create mode 100644 kernel/trace/tracing_map.h
> >
>
>


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/