Re: [PATCH v2 00/15] tracing: 'hist' triggers

From: Tom Zanussi
Date: Tue Mar 03 2015 - 09:48:01 EST


On Tue, 2015-03-03 at 11:25 +0900, Masami Hiramatsu wrote:
> (2015/03/03 1:00), Tom Zanussi wrote:
> > This is v2 of my previously posted 'hashtriggers' patchset [1], but
> > renamed to 'hist triggers' following feedback from v1.
>
> This is what I need :) The trigger interface gives us better flexibility
> for the environment. With this series I believe 80% of the uses of
> "scripting tracing" can be replaced with just "echo'ing tracing" via
> tracefs :)
>

Glad you like it, thanks!

> >
> > Since then, the kernel has gained a tracing map implementation in the
> > form of bpf_map, which this patchset makes a bit more generic, exports
> > and uses (as tracing_map_*, still in the bpf syscall file however).
> >
> > A large part of the initial hash triggers implementation was devoted
> > to a map implementation and general-purpose hashing functions, which
> > have now been subsumed by the bpf maps. I've completely redone the
> > trigger patches themselves to work on top of tracing_map. The result
> > is a much simpler and easier-to-review patchset that's able to focus
> > more directly on the problem at hand.
> >
> > The new version addresses all the comments from the previous review,
> > including changing the name from hash->hist, adding separate 'hist'
> > files for the output, and moving the examples into Documentation.
> >
> > This patchset also includes a couple other new and related triggers,
> > enable_hist and disable_hist, very similar to the existing
> > enable_event/disable_event triggers used to automatically enable and
> > disable events based on a triggering condition, but in this case
> > allowing hist triggers to be enabled and disabled in the same way.
> >
> > The only problem with using the bpf_map implementation for this is
> > that it uses kmalloc internally, which causes problems when trying to
> > trace kmalloc itself. I'm guessing the ebpf tracing code would also
> > share this problem e.g. when using bpf_maps from probes on kmalloc().
> > This patchset attempts a solution to that problem (by adding a
> > gfp flag and changing the kmem memory allocation tracepoints to
> > conditional variants that check for it), but I'm not sure it's
> > the best way to address it.
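
To make the recursion problem above concrete, here is a minimal userspace
sketch. It is not the code from the patches: SKETCH_GFP_NOTRACE,
sketch_alloc() and on_alloc_event() are hypothetical stand-ins for
___GFP_NOTRACE, a traced allocator and a hist-style handler, and malloc()
stands in for kmalloc(). The handler allocates its own storage with the
flag set, so the allocation tracepoint does not fire again for it.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for ___GFP_NOTRACE; not the kernel's value. */
#define SKETCH_GFP_NOTRACE 0x1u

static unsigned long events_seen;

static void *sketch_alloc(size_t size, unsigned int flags);

/* Stand-in for a hist-style handler attached to the allocation event. */
static void on_alloc_event(size_t size)
{
	void *elem;

	(void)size;
	events_seen++;

	/*
	 * The handler needs memory of its own. Without the NOTRACE flag
	 * this call would fire the event again and recurse without end.
	 */
	elem = sketch_alloc(sizeof(long), SKETCH_GFP_NOTRACE);
	free(elem);
}

/* Allocator with a conditional tracepoint: skipped when flagged. */
static void *sketch_alloc(size_t size, unsigned int flags)
{
	void *p = malloc(size);

	if (!(flags & SKETCH_GFP_NOTRACE))
		on_alloc_event(size);
	return p;
}
```

A normal caller passes flags of 0 and is traced once; only allocations
made on behalf of the tracer carry the flag.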
>
> That is not a solution for kprobe-based events, nor the events on
> interrupt context.
> Can we reserve some amount of memory for bpf_map? If the reserved
> memory is exceeded, we can choose to either (A) disable hist or
> (B) continue with kmalloc.
>

Yeah, the non-bpf_map v1 did (A) with reserved memory. I'll take a look
at doing that again for the next version.
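
For reference, option (A) could look roughly like the sketch below. This
is only an illustration under stated assumptions, not the v1 code: the
names (hist_pool, hist_record, POOL_ENTRIES) are made up, the lookup is a
linear scan rather than a hash, and there is no locking, so it ignores
the concurrency a real trigger has to handle. The point is just that all
entries are reserved up front and the histogram disables itself once the
reserve runs out, so the hot path never calls an allocator.

```c
#include <stdbool.h>
#include <stddef.h>

#define POOL_ENTRIES 4	/* hypothetical reserve size, tiny for illustration */

struct hist_entry {
	unsigned long key;
	unsigned long count;
};

struct hist_pool {
	struct hist_entry entries[POOL_ENTRIES];	/* reserved up front */
	size_t used;
	bool disabled;		/* set once the reserve is exhausted */
	unsigned long drops;	/* events dropped after or at exhaustion */
};

/* Hand out the next reserved entry, or NULL. Never allocates. */
static struct hist_entry *hist_pool_get(struct hist_pool *pool)
{
	if (pool->used == POOL_ENTRIES) {
		pool->disabled = true;	/* option (A): disable hist */
		return NULL;
	}
	return &pool->entries[pool->used++];
}

/* Record one hit for key; returns false if the event was dropped. */
static bool hist_record(struct hist_pool *pool, unsigned long key)
{
	struct hist_entry *e;
	size_t i;

	if (pool->disabled) {
		pool->drops++;
		return false;
	}

	for (i = 0; i < pool->used; i++) {
		if (pool->entries[i].key == key) {
			pool->entries[i].count++;
			return true;
		}
	}

	e = hist_pool_get(pool);
	if (!e) {
		pool->drops++;
		return false;
	}
	e->key = key;
	e->count = 1;
	return true;
}
```

Option (B) would replace the NULL return in hist_pool_get() with a
fallback allocation, which brings back the problem of tracing the
allocator itself.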

> >
> > There are a couple of important bits of functionality that were
> > present in v1 but dropped in v2 mainly because I'm still trying to
> > figure out the best way to accomplish those things using the bpf_map
> > implementation.
> >
> > The first is support for compound keys. Currently, maps can only be
> > keyed on a single event field, whereas in v1 they could be keyed on
> > multiple fields. With support for compound keys, you can create much
> > more interesting output, such as per-pid lists of syscalls or read
> > counts, e.g.:
> >
> > # echo 'hist:keys=common_pid.execname,id.syscall:vals=hitcount' > \
> > /sys/kernel/debug/tracing/events/raw_syscalls/sys_enter/trigger
> >
> > # cat /sys/kernel/debug/tracing/events/raw_syscalls/sys_enter/hist
> >
> > key: common_pid:bash[3112], id:sys_write vals: count:69
> > key: common_pid:bash[3112], id:sys_rt_sigprocmask vals: count:218
> >
> > key: common_pid:update-notifier[3164], id:sys_poll vals: count:37
> > key: common_pid:update-notifier[3164], id:sys_recvfrom vals: count:118
> >
> > key: common_pid:deja-dup-monito[3194], id:sys_sendto vals: count:1
> > key: common_pid:deja-dup-monito[3194], id:sys_read vals: count:4
> > key: common_pid:deja-dup-monito[3194], id:sys_poll vals: count:8
> > key: common_pid:deja-dup-monito[3194], id:sys_recvmsg vals: count:8
> > key: common_pid:deja-dup-monito[3194], id:sys_getegid vals: count:8
> >
> > key: common_pid:emacs[3275], id:sys_fsync vals: count:1
> > key: common_pid:emacs[3275], id:sys_open vals: count:1
> > key: common_pid:emacs[3275], id:sys_symlink vals: count:2
> > key: common_pid:emacs[3275], id:sys_poll vals: count:23
> > key: common_pid:emacs[3275], id:sys_select vals: count:23
> > key: common_pid:emacs[3275], id:unknown_syscall vals: count:34
> > key: common_pid:emacs[3275], id:sys_ioctl vals: count:60
> > key: common_pid:emacs[3275], id:sys_rt_sigprocmask vals: count:116
> >
> > key: common_pid:cat[3323], id:sys_munmap vals: count:1
> > key: common_pid:cat[3323], id:sys_fadvise64 vals: count:1
>
> Very impressive! :)
>

Thanks!
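
As a rough illustration of what a compound key means for the map, the
sketch below keys a small fixed-size table on the (pid, syscall id) pair
and bumps a hitcount per pair. Everything here is an assumption made for
the example: the struct and function names are invented, the hash is
plain 32-bit FNV-1a, and the table uses open addressing with linear
probing. It is not how bpf_map or tracing_map is implemented.

```c
#include <stddef.h>
#include <stdint.h>

#define TABLE_SIZE 64	/* must be a power of two for the index mask */

struct compound_key {
	uint32_t pid;
	uint32_t syscall_id;
};

struct slot {
	struct compound_key key;
	uint64_t hitcount;
	int in_use;
};

/* Fold the four bytes of value into a running 32-bit FNV-1a hash. */
static uint32_t fnv1a_u32(uint32_t hash, uint32_t value)
{
	int i;

	for (i = 0; i < 4; i++) {
		hash ^= (value >> (8 * i)) & 0xff;
		hash *= 16777619u;	/* FNV prime */
	}
	return hash;
}

/* Hash both fields in order, so (a, b) and (b, a) differ in general. */
static uint32_t key_hash(const struct compound_key *k)
{
	uint32_t h = 2166136261u;	/* FNV offset basis */

	h = fnv1a_u32(h, k->pid);
	h = fnv1a_u32(h, k->syscall_id);
	return h;
}

static int key_equal(const struct compound_key *a,
		     const struct compound_key *b)
{
	return a->pid == b->pid && a->syscall_id == b->syscall_id;
}

/* Find the slot for k, claiming a free one if needed; NULL if full. */
static struct slot *table_lookup_or_insert(struct slot *table,
					   const struct compound_key *k)
{
	uint32_t idx = key_hash(k) & (TABLE_SIZE - 1);
	size_t probe;

	for (probe = 0; probe < TABLE_SIZE; probe++) {
		struct slot *s = &table[(idx + probe) & (TABLE_SIZE - 1)];

		if (!s->in_use) {
			s->key = *k;
			s->hitcount = 0;
			s->in_use = 1;
			return s;
		}
		if (key_equal(&s->key, k))
			return s;
	}
	return NULL;
}

/* Count one event for the (pid, syscall_id) pair; -1 if table is full. */
static int hist_hit(struct slot *table, uint32_t pid, uint32_t syscall_id)
{
	struct compound_key k = { pid, syscall_id };
	struct slot *s = table_lookup_or_insert(table, &k);

	if (!s)
		return -1;
	s->hitcount++;
	return 0;
}
```

The fields are hashed one at a time rather than hashing the struct as raw
bytes, which avoids depending on padding if more fields are added later.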

Tom

> Thank you,
>
> >
> > Related to that is support for sorting on multiple fields. Currently,
> > you can sort using only a primary key. Being able to sort on multiple
> > or at least a secondary key is indispensable for seeing trends when
> > displaying multiple values.
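
A sort on a primary plus a secondary key can be sketched with a
two-level comparator, as below. The row layout and names (hist_row,
hist_sort) are assumptions for illustration and do not come from the
patchset. Rows are ordered by pid first and by ascending hitcount within
each pid, which is the grouping the per-pid example output earlier in
this mail shows.

```c
#include <stdint.h>
#include <stdlib.h>

struct hist_row {
	uint32_t pid;
	uint32_t syscall_id;
	uint64_t hitcount;
};

/* Primary key: pid. Secondary key: hitcount, ascending. */
static int cmp_pid_then_hitcount(const void *a, const void *b)
{
	const struct hist_row *x = a;
	const struct hist_row *y = b;

	if (x->pid != y->pid)
		return x->pid < y->pid ? -1 : 1;
	if (x->hitcount != y->hitcount)
		return x->hitcount < y->hitcount ? -1 : 1;
	return 0;
}

/* Sort n rows in place using the two-level comparator. */
static void hist_sort(struct hist_row *rows, size_t n)
{
	qsort(rows, n, sizeof(*rows), cmp_pid_then_hitcount);
}
```

The comparator uses explicit comparisons instead of subtraction, since
subtracting unsigned or 64-bit values and truncating to int can give the
wrong sign.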
> >
> > [1] http://thread.gmane.org/gmane.linux.kernel/1673551
> >
> > Changes from v1:
> > - completely rewritten on top of tracing_map (renamed and exported bpf_map)
> > - added map clearing and client ops to tracing_map
> > - changed the name from 'hash' triggers to 'hist' triggers
> > - added new trigger 'pause' feature
> > - added new enable_hist and disable_hist triggers
> > - added usage for hist/enable_hist/disable_hist to tracing/README
> > - moved examples into Documentation/trace/events.txt
> > - added ___GFP_NOTRACE, kmalloc/kfree macros, and conditional kmem tracepoints
> >
> > The following changes since commit 49058038a12cfd9044146a1bf4b286781268d5c9:
> >
> > ring-buffer: Do not wake up a splice waiter when page is not full (2015-02-24 14:00:41 -0600)
> >
> > are available in the git repository at:
> >
> > git://git.yoctoproject.org/linux-yocto-contrib.git tzanussi/hist-triggers-v2
> > http://git.yoctoproject.org/cgit/cgit.cgi/linux-yocto-contrib/log/?h=tzanussi/hist-triggers-v2
> >
> > Tom Zanussi (15):
> > tracing: Make ftrace_event_field checking functions available
> > tracing: Add event record param to trigger_ops.func()
> > tracing: Add get_syscall_name()
> > bpf: Export bpf map functionality as trace_map_*
> > bpf: Export a map-clearing function
> > bpf: Add tracing_map client ops
> > mm: Add ___GFP_NOTRACE
> > tracing: Make kmem memory allocation tracepoints conditional
> > tracing: Add kmalloc/kfree macros
> > bpf: Make tracing_map use kmalloc/kfree_notrace()
> > tracing: Add a per-event-trigger 'paused' field
> > tracing: Add 'hist' event trigger command
> > tracing: Add sorting to hist triggers
> > tracing: Add enable_hist/disable_hist triggers
> > tracing: Add 'hist' trigger Documentation
> >
> > Documentation/trace/events.txt      |  870 +++++++++++++++++++++
> > include/linux/bpf.h                 |   15 +
> > include/linux/ftrace_event.h        |    9 +-
> > include/linux/gfp.h                 |    3 +-
> > include/linux/slab.h                |   61 +-
> > include/trace/events/kmem.h         |   28 +-
> > kernel/bpf/arraymap.c               |   16 +
> > kernel/bpf/hashtab.c                |   39 +-
> > kernel/bpf/syscall.c                |  193 ++++-
> > kernel/trace/trace.c                |   48 ++
> > kernel/trace/trace.h                |   25 +-
> > kernel/trace/trace_events.c         |    3 +
> > kernel/trace/trace_events_filter.c  |   15 +-
> > kernel/trace/trace_events_trigger.c | 1466 ++++++++++++++++++++++++++++++++++-
> > kernel/trace/trace_syscalls.c       |   11 +
> > mm/slab.c                           |   45 +-
> > mm/slob.c                           |   45 +-
> > mm/slub.c                           |   47 +-
> > 18 files changed, 2795 insertions(+), 144 deletions(-)
> >
>
>

