[PATCH V2 0/1] tools/dtrace: initial implementation of DTrace

From: Kris Van Hees
Date: Wed Jul 10 2019 - 11:39:01 EST


This is version 2 of the patch, incorporating feedback from Peter Zijlstra and
Arnaldo Carvalho de Melo.

Changes in Makefile:
- Remove -I$(srctree)/tools/perf from KBUILD_HOSTCFLAGS since it
  is not actually used.

Changes in dt_bpf.c:
- Remove unnecessary PERF_EVENT_IOC_ENABLE.

Changes in dt_buffer.c:
- Use ring_buffer_read_head() and ring_buffer_write_tail() to
  avoid use of volatile.
- Handle perf events that wrap around the ring buffer boundary.
- Remove unnecessary PERF_EVENT_IOC_ENABLE.
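
For readers unfamiliar with the wrap-around case, here is a minimal sketch
of copying a record that crosses the end of the ring buffer data area. It is
an illustration only, not the code in dt_buffer.c: ring_copy() and its
parameters are hypothetical names, and the data area size is assumed to be a
power of two, as the perf mmap interface requires.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Copy 'len' bytes starting at logical offset 'tail' out of a ring buffer
 * data area of 'size' bytes into 'dst'.  A record that crosses the end of
 * the data area is copied in two pieces.
 *
 * Hypothetical helper for illustration; not taken from dt_buffer.c.
 */
static void ring_copy(void *dst, const uint8_t *base, size_t size,
		      uint64_t tail, size_t len)
{
	size_t off, first;

	/* Assumption: the data area size is a non-zero power of two. */
	assert(size != 0 && (size & (size - 1)) == 0);
	assert(len <= size);

	off = (size_t)(tail & (size - 1));	/* position inside the data area */
	first = size - off;			/* bytes left before the boundary */
	if (first > len)
		first = len;

	memcpy(dst, base + off, first);
	/* Remainder (possibly zero bytes) continues at the start of the area. */
	memcpy((uint8_t *)dst + first, base, len - first);
}
```

With an 8-byte data area, a tail of 14 and a length of 4, the first two bytes
come from offsets 6 and 7 and the remaining two from offsets 0 and 1.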

Changes in bpf_sample.c:
- Use PT_REGS_PARM1(x), etc. instead of my own macros. Add
  PT_REGS_PARM6(x) in bpf_sample.c because we need to be able to
  support up to 6 arguments passed in registers.
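
As a rough illustration of the sixth-argument macro, the sketch below
assumes x86-64, where the System V calling convention passes the first six
integer arguments in rdi, rsi, rdx, rcx, r8 and r9. struct sketch_regs and
sketch_arg() are hypothetical stand-ins so that the example compiles on its
own; they are not the kernel's struct pt_regs or anything in bpf_sample.c.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the x86-64 argument registers. */
struct sketch_regs {
	uint64_t di, si, dx, cx, r8, r9;
};

/* Register-to-argument mapping, assuming the System V x86-64 ABI. */
#define PT_REGS_PARM1(x) ((x)->di)
#define PT_REGS_PARM2(x) ((x)->si)
#define PT_REGS_PARM3(x) ((x)->dx)
#define PT_REGS_PARM4(x) ((x)->cx)
#define PT_REGS_PARM5(x) ((x)->r8)
/* The addition: the sixth register argument lives in r9 (assumption). */
#define PT_REGS_PARM6(x) ((x)->r9)

/* Hypothetical helper: fetch register argument n (0-based, n < 6). */
static uint64_t sketch_arg(const struct sketch_regs *regs, int n)
{
	assert(n >= 0 && n < 6);

	switch (n) {
	case 0: return PT_REGS_PARM1(regs);
	case 1: return PT_REGS_PARM2(regs);
	case 2: return PT_REGS_PARM3(regs);
	case 3: return PT_REGS_PARM4(regs);
	case 4: return PT_REGS_PARM5(regs);
	default: return PT_REGS_PARM6(regs);
	}
}
```

Other architectures would need their own mapping.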

This patch is also available, applied to bpf-next, at the following URL:

https://github.com/oracle/dtrace-linux-kernel/tree/dtrace-bpf

As suggested in feedback on my earlier patch submissions, this code avoids
kernel code changes as much as possible. The current patch does not involve
any kernel code changes. Further development will continue with this
approach, incrementally adding features to this first minimal implementation.
The goal is a fully featured and functional DTrace implementation that
involves kernel changes only when strictly necessary.

The code presented here supports two very basic functions:

1. Listing probes that are used in BPF programs

    # dtrace -l -s bpf_sample.o
       ID  PROVIDER  MODULE   FUNCTION    NAME
    18876  fbt       vmlinux  ksys_write  entry
    70423  syscall   vmlinux  write       entry

2. Loading BPF tracing programs and collecting data that they generate

    # dtrace -s bpf_sample.o
    CPU     ID
     15  70423 0xffff8c0968bf8ec0 0x00000000000001 0x0055e019eb3f60 0x0000000000002c
     15  18876 0xffff8c0968bf8ec0 0x00000000000001 0x0055e019eb3f60 0x0000000000002c
    ...

Because this is an initial patch, only kprobes and syscall tracepoints are
supported. It does demonstrate a generic BPF function that implements the
actual probe action and is called from two distinct probe types. Follow-up
patches will add more probe types, add more tracing features from the D
language, add support for D script compilation to BPF, etc.
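
The shape of "one shared action, two probe types" can be modelled in
ordinary C as below. This is not loadable BPF and not the code in
bpf_sample.c: every struct and function name is hypothetical, and the two
context layouts are simplified stand-ins. Only the probe IDs are taken from
the listing above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record produced by the shared action. */
struct sketch_rec {
	uint32_t probe_id;
	uint64_t argv[3];
};

/* Simplified stand-ins for the two probe contexts (assumptions). */
struct sketch_kprobe_ctx {
	uint64_t di, si, dx;		/* first three register arguments */
};

struct sketch_syscall_ctx {
	long id;			/* syscall number */
	uint64_t args[6];		/* syscall arguments */
};

/* The generic action: identical no matter which probe type fired. */
static void sketch_action(struct sketch_rec *rec, uint32_t probe_id,
			  uint64_t a0, uint64_t a1, uint64_t a2)
{
	assert(rec != NULL);

	rec->probe_id = probe_id;
	rec->argv[0] = a0;
	rec->argv[1] = a1;
	rec->argv[2] = a2;
}

/* kprobe-style entry point: arguments come from registers. */
static void sketch_fbt_entry(const struct sketch_kprobe_ctx *ctx,
			     struct sketch_rec *rec)
{
	sketch_action(rec, 18876, ctx->di, ctx->si, ctx->dx);
}

/* Syscall-tracepoint-style entry point: arguments come from an array. */
static void sketch_syscall_entry(const struct sketch_syscall_ctx *ctx,
				 struct sketch_rec *rec)
{
	sketch_action(rec, 70423, ctx->args[0], ctx->args[1], ctx->args[2]);
}
```

Each entry point only translates its own context into plain arguments, so
the action itself never needs to know which probe type invoked it.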

The implementation makes use of libbpf for handling BPF ELF objects, and uses
the perf event output ring buffer (supported through BPF) to retrieve the
tracing data. The next step in development will be adding support to libbpf
for programs using shared functions from a collection of functions included in
the BPF ELF object (as suggested by Alexei).

The code is structured as follows:
  tools/dtrace/dtrace.c      = command line utility
  tools/dtrace/dt_bpf.c      = interface to libbpf
  tools/dtrace/dt_buffer.c   = perf event output buffer handling
  tools/dtrace/dt_fbt.c      = kprobes probe provider
  tools/dtrace/dt_syscall.c  = syscall tracepoint probe provider
  tools/dtrace/dt_probe.c    = generic probe and probe provider handling code
                               This implements a generic interface to the
                               actual probe providers (dt_fbt and dt_syscall).
  tools/dtrace/dt_hash.c     = general probe hashing implementation
  tools/dtrace/dt_utils.c    = support code (manage list of online CPUs)
  tools/dtrace/dtrace.h      = API header file (used by BPF program source code)
  tools/dtrace/dtrace_impl.h = implementation header file
  tools/dtrace/bpf_sample.c  = sample BPF program using two probe types
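
One common way to put a generic layer in front of several providers is a
table of descriptors with function pointers. The sketch below shows that
pattern only; struct sketch_provider, its fields and the lookup function are
hypothetical and do not describe the actual contents of dt_probe.c or
dtrace_impl.h.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical provider descriptor; the real layout may differ. */
struct sketch_provider {
	const char *name;			/* e.g. "fbt" or "syscall" */
	const char *(*probe_kind)(void);	/* what the provider attaches to */
};

static const char *sketch_fbt_kind(void)
{
	return "kprobe";
}

static const char *sketch_syscall_kind(void)
{
	return "syscall tracepoint";
}

/* The generic layer only ever sees this table. */
static const struct sketch_provider sketch_providers[] = {
	{ "fbt", sketch_fbt_kind },
	{ "syscall", sketch_syscall_kind },
};

/* Look up a provider by name; returns NULL if it is not registered. */
static const struct sketch_provider *sketch_provider_find(const char *name)
{
	size_t i;

	assert(name != NULL);

	for (i = 0; i < sizeof(sketch_providers) / sizeof(sketch_providers[0]); i++)
		if (strcmp(sketch_providers[i].name, name) == 0)
			return &sketch_providers[i];
	return NULL;
}
```

Under this pattern, supporting a new probe type means adding one table entry
and its callbacks, without touching the generic code.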

I included an entry for the MAINTAINERS file. I offer to actively maintain
this code, and to keep advancing its development.

Cheers,
Kris Van Hees