Re: [PATCH V2 1/1 (was 0/1 by accident)] tools/dtrace: initial implementation of DTrace
From: Daniel Borkmann
Date: Wed Jul 10 2019 - 15:32:34 EST
Hello Kris,
On 07/10/2019 08:12 PM, Kris Van Hees wrote:
> This patch's subject should of course be [PATCH V2 1/1] rather than 0/1.
> Sorry about that.
>
> On Wed, Jul 10, 2019 at 08:42:24AM -0700, Kris Van Hees wrote:
>> This initial implementation of a tiny subset of DTrace functionality
>> provides the following options:
>>
>> dtrace [-lvV] [-b bufsz] -s script
>> -b set trace buffer size
>> -l list probes (only works with '-s script' for now)
>> -s enable or list probes for the specified BPF program
>> -V report DTrace API version
>>
>> The patch comprises quite a bit of code due to DTrace requiring a few
>> crucial components, even in its most basic form.
>>
>> The code is structured around the command line interface implemented in
>> dtrace.c. It provides option parsing and drives the three modes of
>> operation that are currently implemented:
>>
>> 1. Report DTrace API version information.
>> Report the version information and terminate.
>>
>> 2. List probes in BPF programs.
>> Initialize the list of probes that DTrace recognizes, load BPF
>> programs, parse all BPF ELF section names, resolve them into
>> known probes, and emit the probe names. Then terminate.
>>
>> 3. Load BPF programs and collect tracing data.
>> Initialize the list of probes that DTrace recognizes, load BPF
>> programs and attach them to their corresponding probes, set up
>> perf event output buffers, and start processing tracing data.
>>
>> This implementation makes extensive use of BPF (handled by dt_bpf.c) and
>> the perf event output ring buffer (handled by dt_buffer.c). DTrace-style
>> probe handling (dt_probe.c) offers an interface to probes that hides the
>> implementation details of the individual probe types by provider (dt_fbt.c
>> and dt_syscall.c). Probe lookup by name uses a hashtable implementation
>> (dt_hash.c). The dt_utils.c code populates a list of online CPU ids, so
>> we know what CPUs we can obtain tracing data from.
>>
>> Building the tool is trivial because its only dependency (libbpf) is in
>> the kernel tree under tools/lib/bpf. A simple 'make' in the tools/dtrace
>> directory suffices.
>>
>> The 'dtrace' executable needs to run as root because BPF programs cannot
>> be loaded by non-root users.
>>
>> Signed-off-by: Kris Van Hees <kris.van.hees@xxxxxxxxxx>
>> Reviewed-by: David Mc Lean <david.mclean@xxxxxxxxxx>
>> Reviewed-by: Eugene Loh <eugene.loh@xxxxxxxxxx>
>> ---
>> Changes in v2:
>> - Use ring_buffer_read_head() and ring_buffer_write_tail() to
>> avoid use of volatile.
>> - Handle perf events that wrap around the ring buffer boundary.
>> - Remove unnecessary PERF_EVENT_IOC_ENABLE.
>> - Remove -I$(srctree)/tools/perf from KBUILD_HOSTCFLAGS since it
>> is not actually used.
>> - Use PT_REGS_PARM1(x), etc instead of my own macros. Adding
>> PT_REGS_PARM6(x) in bpf_sample.c because we need to be able to
>> support up to 6 arguments passed by registers.
Looks like you missed Brendan Gregg's prior feedback from v1 [0]. I haven't
seen a strong compelling argument for why this needs to reside in the kernel
tree given we also have all the other tracing tools and many of which also
rely on BPF such as bcc, bpftrace, ply, systemtap, sysdig, lttng to just name
a few. Given all the other tracers manage to live outside the kernel tree just
fine, so can dtrace as well; it's _not_ special in this regard in any way. It
will be tons of code in long term which is better off in its separate project,
and if we add tools/dtrace/, other projects will come as well asking for kernel
tree inclusion 'because tools/dtrace' is now there, too. While it totally makes
sense to extend the missing kernel bits where needed, it doesn't make sense to
have another big tracing project similar to perf in the tree. Therefore, I'm
not applying this patch, sorry.
Thanks,
Daniel
[0] https://lore.kernel.org/bpf/CAE40pdeSfJBpbBHTmwz1xZ+MW02=kJ0krq1mN+EkjSLqf2GX_w@xxxxxxxxxxxxxx/
>> ---
>> MAINTAINERS | 6 +
>> tools/dtrace/Makefile | 87 ++++++++++
>> tools/dtrace/bpf_sample.c | 146 ++++++++++++++++
>> tools/dtrace/dt_bpf.c | 185 ++++++++++++++++++++
>> tools/dtrace/dt_buffer.c | 338 +++++++++++++++++++++++++++++++++++++
>> tools/dtrace/dt_fbt.c | 201 ++++++++++++++++++++++
>> tools/dtrace/dt_hash.c | 211 +++++++++++++++++++++++
>> tools/dtrace/dt_probe.c | 230 +++++++++++++++++++++++++
>> tools/dtrace/dt_syscall.c | 179 ++++++++++++++++++++
>> tools/dtrace/dt_utils.c | 132 +++++++++++++++
>> tools/dtrace/dtrace.c | 249 +++++++++++++++++++++++++++
>> tools/dtrace/dtrace.h | 13 ++
>> tools/dtrace/dtrace_impl.h | 101 +++++++++++
>> 13 files changed, 2078 insertions(+)
>> create mode 100644 tools/dtrace/Makefile
>> create mode 100644 tools/dtrace/bpf_sample.c
>> create mode 100644 tools/dtrace/dt_bpf.c
>> create mode 100644 tools/dtrace/dt_buffer.c
>> create mode 100644 tools/dtrace/dt_fbt.c
>> create mode 100644 tools/dtrace/dt_hash.c
>> create mode 100644 tools/dtrace/dt_probe.c
>> create mode 100644 tools/dtrace/dt_syscall.c
>> create mode 100644 tools/dtrace/dt_utils.c
>> create mode 100644 tools/dtrace/dtrace.c
>> create mode 100644 tools/dtrace/dtrace.h
>> create mode 100644 tools/dtrace/dtrace_impl.h