ANNOUNCE: bpfd - a remote proxy daemon for executing bpf code (with corres. bcc changes)

From: Joel Fernandes
Date: Fri Dec 29 2017 - 03:58:37 EST


Hi Guys,

I've been working on an idea I discussed with other kernel developers
(Alexei, Josef etc) last LPC about how to make it easier to run bcc
tools on remote systems.

Use case
========
Run bcc tools on a remotely connected system without having to load
the entire LLVM infrastructure onto the remote target or sync the
kernel sources with it.
On architectures such as ARM64 especially, it's a bit more work if you
were to run the tools directly on the target itself (local to the
target) because LLVM and Python have to be cross-compiled for it
(along with syncing of kernel sources, which takes up space and needs
to be kept in sync for correct operation). I believe Facebook also has
some use cases where they want to run bcc tools on remote instances.
Lastly, this is also the way arm64 development normally happens: you
cross-build for it, and typically the ARM64 embedded systems may not
have much space for kernel sources and clang, so it's sometimes better
if the tools are remote. All our kernel development for Android is
cross-developed with the cross-toolchain running remotely as well.
I am looking forward to collaborating with interested developers on
this idea and getting more feedback about the design etc. I am also
planning to talk about it next year during SCALE and OSPM.

Implementation
==============
To facilitate this, I started working on a daemon called bpfd, which
runs on the remote target and listens for commands:
https://github.com/joelagnel/bpfd
The daemon does more than proxy the bpf syscall; there are several
things, like registering a kprobe with perf and perf callbacks, that
need to be replicated. All this infrastructure is pretty much code
complete in bpfd.

Sample commands sent to bpfd are as follows:
https://github.com/joelagnel/bpfd/blob/master/tests/TESTS
------------------------
; Program opensnoop
BPF_CREATE_MAP 1 8 40 10240 0
BPF_CREATE_MAP 4 4 4 2 0

BPF_PROG_LOAD 2 248 GPL 264721 eRdwAAAAAAC3AQAAAAAAAHsa+P8AA[...]
BPF_PROG_LOAD 2 664 GPL 264721 vxYAAAAAAACFAAAADgAAAHsK+P8AA[...]
------------------------
Binary streams are sent base64-encoded, which keeps the handling of
binary data in the text commands simple.
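
As a rough illustration of how such a line could be built and parsed,
here is a minimal Python sketch. The field meanings are my reading of
the sample above, not a spec: I'm assuming the fields are program type,
instruction length in bytes, license, kernel version code and then the
base64 payload. The helper names are made up for this example and are
not taken from bpfd or bcc.

```python
import base64


def encode_prog_load(prog_type, license_name, kern_version, insns):
    """Format a BPF_PROG_LOAD line from raw instruction bytes.

    Assumption: the second field is the instruction length in bytes.
    """
    payload = base64.b64encode(insns).decode("ascii")
    return "BPF_PROG_LOAD %d %d %s %d %s" % (
        prog_type, len(insns), license_name, kern_version, payload)


def decode_prog_load(line):
    """Parse a BPF_PROG_LOAD line back into its fields."""
    cmd, prog_type, length, license_name, kern_version, payload = \
        line.split(" ", 5)
    if cmd != "BPF_PROG_LOAD":
        raise ValueError("not a BPF_PROG_LOAD command: %r" % cmd)
    insns = base64.b64decode(payload)
    if len(insns) != int(length):
        raise ValueError("length field does not match payload size")
    return int(prog_type), license_name, int(kern_version), insns


# Two 8-byte eBPF instructions: "mov r0, 0" followed by "exit".
EXAMPLE_INSNS = bytes([0xb7, 0, 0, 0, 0, 0, 0, 0,
                       0x95, 0, 0, 0, 0, 0, 0, 0])

if __name__ == "__main__":
    line = encode_prog_load(2, "GPL", 264721, EXAMPLE_INSNS)
    print(line)
    print(decode_prog_load(line))
```

Because the payload is plain ASCII, the whole command stays a single
line of text that can be logged or passed through a pipe unchanged.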

Several patches have been written on the bcc side to be able to send
these commands using a "remotes infrastructure", available in the
branch at:
https://github.com/joelagnel/bcc/commits/bcc-bpfd
My idea was to keep the remote infrastructure as generic/plug-and-play
as possible, so that in the future it's easy to add other remotes like
networking. Currently I have an adb (Android Debug Bridge) remote and
a shell remote:
https://github.com/joelagnel/bcc/tree/bcc-bpfd/src/python/bcc/remote

The shell remote is more of a "test" remote that simply forks bpfd and
communicates with it over stdio. This makes the development quite
easy.
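
To give a feel for the plug-and-play idea, here is a minimal Python
sketch of a remote interface with a shell-style implementation that
forks a child process and talks to it over stdio. The class and method
names are hypothetical and do not reflect the actual code in the
bcc-bpfd branch. I'm also assuming one response line per command, and
the child here is a stand-in that just acknowledges each line, since
the sketch does not depend on bpfd's real response format.

```python
import subprocess
import sys
from abc import ABC, abstractmethod


class Remote(ABC):
    """Hypothetical transport interface; other remotes would subclass it."""

    @abstractmethod
    def send_command(self, line):
        """Send one command line and return one response line."""

    @abstractmethod
    def close(self):
        """Shut the transport down."""


class ShellRemote(Remote):
    """Forks a local child process and talks to it over stdin/stdout."""

    def __init__(self, argv):
        self.proc = subprocess.Popen(
            argv, stdin=subprocess.PIPE, stdout=subprocess.PIPE,
            universal_newlines=True, bufsize=1)

    def send_command(self, line):
        self.proc.stdin.write(line + "\n")
        self.proc.stdin.flush()
        # Assumption: the daemon answers with exactly one line.
        return self.proc.stdout.readline().rstrip("\n")

    def close(self):
        self.proc.stdin.close()   # EOF tells the child to exit
        self.proc.wait()
        self.proc.stdout.close()


# Stand-in for the daemon: prefixes every received line with "ack ".
STAND_IN = [sys.executable, "-u", "-c",
            "import sys\n"
            "for l in iter(sys.stdin.readline, ''):\n"
            "    sys.stdout.write('ack ' + l)\n"
            "    sys.stdout.flush()\n"]

if __name__ == "__main__":
    remote = ShellRemote(STAND_IN)
    print(remote.send_command("BPF_CREATE_MAP 1 8 40 10240 0"))
    remote.close()
```

An adb or network remote would only need to provide the same two
methods, so the tools would not care which transport is in use.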

Status
======
What's working:
- executing several bcc tools across a process boundary using the
"shell" remote (bcc tools and bpfd both running on a local x86 machine).
- communication with a remote arm64 Android target using the "adb"
remote. But there are several issues to do with arm64 and bcc tools
that I'm ironing out. Since my arm64 bcc hackery is a bit recent, I
created a separate WIP branch here:
https://github.com/joelagnel/bcc/tree/bcc-bpfd-arm64. I don't expect
these to be major issues since I noticed some folks have been using
bcc tools on arm64 already.

Issues:
- Since bcc builds with clang on x86, the eBPF backend code is
generated for x86. Although it loads fine on arm64, there seem to be
several issues, such as the kprobe handler not seeing arguments or the
return code correctly in opensnoop. This is (probably) easy to fix by
having the user tell bcc which architecture we're building for, but
that would mean we carry code for each arch when building the bcc
libraries and dynamically select the code path to run, rather than
building only for the C++ compiler's target architecture.
- Some operations are quite slow, such as stackcount when there are a
lot of stack traces. Each stack trace is a key, and every key
iterated comes at a cost, which adds up. Maybe we can batch these up so
that they're faster instead of making each key iteration a separate
remote command/response?
- Some tools read the ps table on the local host. This needs to be
remotely proxied.
- Provide a mechanism to make bcc/clang build eBPF for arm64 (using a
command line switch)?
- Design a generic parser mechanism to be added to all bcc tools to be
able to pass which remote method to use, what the remote architecture
is and what the path to the kernel sources is (for kprobes to work).
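
On the slow key iteration mentioned in the stackcount item above, here
is a small Python sketch of why batching should help. It only counts
command/response round trips against a fake in-memory map; nothing here
talks to bpfd. The dump_batch operation is hypothetical (the idea being
floated, not an existing command), and all the names are made up for
illustration.

```python
class FakeRemoteMap:
    """In-memory stand-in for a map on the target; counts round trips."""

    def __init__(self, entries):
        self.entries = dict(entries)
        self.keys = sorted(self.entries)
        self.round_trips = 0

    def get_next_key(self, key=None):
        # One command/response per call, like iterating keys remotely.
        self.round_trips += 1
        idx = 0 if key is None else self.keys.index(key) + 1
        return self.keys[idx] if idx < len(self.keys) else None

    def lookup(self, key):
        self.round_trips += 1
        return self.entries[key]

    def dump_batch(self, start, count):
        # Hypothetical batched command: many key/value pairs per reply.
        self.round_trips += 1
        chunk = self.keys[start:start + count]
        return [(k, self.entries[k]) for k in chunk]


def read_per_key(remote):
    """Walk the map one key at a time (two round trips per entry)."""
    items = []
    key = remote.get_next_key()
    while key is not None:
        items.append((key, remote.lookup(key)))
        key = remote.get_next_key(key)
    return items


def read_batched(remote, batch_size=256):
    """Walk the map in chunks (one round trip per chunk)."""
    items = []
    start = 0
    while True:
        chunk = remote.dump_batch(start, batch_size)
        items.extend(chunk)
        if len(chunk) < batch_size:
            return items
        start += batch_size


if __name__ == "__main__":
    data = {i: i * 2 for i in range(1000)}
    slow, fast = FakeRemoteMap(data), FakeRemoteMap(data)
    assert read_per_key(slow) == read_batched(fast)
    print(slow.round_trips, fast.round_trips)
```

With 1000 entries, the per-key walk needs 2001 round trips in this
model while the batched walk with a batch size of 256 needs 4, so the
saving grows with the number of stack traces.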

Thanks a lot to Alexei for discussing ideas at the conference and for
all the great advice and help.

Regards,

- Joel

PS: We also have some use cases where our Android networking daemon has
hardcoded eBPF asm and our teams want to write them in C and load the
binary stream. It seems bpfd can be a good fit here as well.