Re: [RFC PATCH bpf-next 1/2] bpf: share BTF "show" implementation between kernel and libbpf

From: Alan Maguire
Date: Thu Jan 14 2021 - 10:39:29 EST


On Mon, 11 Jan 2021, Andrii Nakryiko wrote:

> On Mon, Jan 11, 2021 at 9:34 AM Alan Maguire <alan.maguire@xxxxxxxxxx> wrote:
> > Currently the only "show" function for userspace is to write the
> > representation of the typed data to a string via
> >
> > LIBBPF_API int
> > btf__snprintf(struct btf *btf, char *buf, int len, __u32 id, void *obj,
> > __u64 flags);
> >
> > ...but other approaches could be pursued including printf()-based
> > show, or even a callback mechanism could be supported to allow
> > user-defined show functions.
> >
>
> It's strange that you saw btf_dump APIs, and yet decided to go with
> this API instead. snprintf() is not a natural "method" of struct btf.
> Using char buffer as an output is overly restrictive and inconvenient.
> It's appropriate for kernel and BPF program due to their restrictions,
> but there is no need to cripple libbpf APIs for that. I think it
> should follow btf_dump APIs with custom callback so that it's easy to
> just printf() everything, but also user can create whatever elaborate
> mechanism they need and that fits their use case.
>
> Code reuse is not the ultimate goal, it should facilitate
> maintainability, not harm it. There are times where sharing code
> introduces unnecessary coupling and maintainability issues. And I
> think this one is a very obvious case of that.
>

Okay, so I've been exploring adding dumper API support. The initial
approach I've been using is to provide an API like this:

/* match show flags for bpf_show_snprintf() */
enum {
BTF_DUMP_F_COMPACT = (1ULL << 0),
BTF_DUMP_F_NONAME = (1ULL << 1),
BTF_DUMP_F_ZERO = (1ULL << 3),
};

struct btf_dump_emit_type_data_opts {
/* size of this struct, for forward/backward compatibility */
size_t sz;
void *data;
int indent_level;
__u64 flags;
};
#define btf_dump_emit_type_data_opts__last_field flags

LIBBPF_API int
btf_dump__emit_type_data(struct btf_dump *d, __u32 id,
const struct btf_dump_emit_type_data_opts *opts);


...so the opts play a similiar role to the struct btf_ptr + flags
in bpf_snprintf_btf. I've got this working, but the current
implementation is tied to emitting the same C-based syntax as
bpf_snprintf_btf(); though of course the printf function is invoked.
So a use case looks something like this:

struct btf_dump_emit_type_data_opts opts;
char skbufmem[1024], skbufstr[8192];
struct btf *btf = libbpf_find_kernel_btf();
struct btf_dump *d;
__s32 skbid;
int indent = 0;

memset(skbufmem, 0xff, sizeof(skbufmem));
opts.data = skbufmem;
opts.sz = sizeof(opts);
opts.indent_level = indent;

d = btf_dump__new(btf, NULL, NULL, printffn);

skbid = btf__find_by_name_kind(btf, "sk_buff", BTF_KIND_STRUCT);
if (skbid < 0) {
fprintf(stderr, "no skbuff, err %d\n", skbid);
exit(1);
}

btf_dump__emit_type_data(d, skbid, &opts);


..and we get output of the form

(struct sk_buff){
(union){
(struct){
.next = (struct sk_buff *)0xffffffffffffffff,
.prev = (struct sk_buff *)0xffffffffffffffff,
(union){
.dev = (struct net_device *)0xffffffffffffffff,
.dev_scratch = (long unsigned int)18446744073709551615,
},
},
...

etc. However it would be nice to find a way to help printf function
providers emit different formats such as JSON without having to
parse the data they are provided in the printf function.
That would remove the need for the output flags, since the printf
function provider could control display.

If we provided an option to provider a "kind" printf function,
and ensured that the BTF dumper sets a "kind" prior to each
_internal_ call to the printf function, we could use that info
to adapt output in various ways. For example, consider the case
where we want to emit C-type output. We can use the kind
info to control output for various scenarios:

void c_dump_kind_printf(struct btf_dump *d, enum btf_dump_kind kind,
void *ctx, const char *fmt, va_list args)
{
switch (kind) {
case BTF_DUMP_KIND_TYPE_NAME:
/* For C, add brackets around the type name string ( ) */
btf_dump__printf(d, "(");
btf_dump__vprintf(d, fmt, args);
btf_dump__printf(d, ")");
break;
case BTF_DUMP_KIND_MEMBER_NAME:
/* for C, prefix a "." to member name, suffix a "=" */
btf_dump__printf(d, ".");
btf_dump__vprintf(d, fmt, args);
btf_dump__printf(d, " = ");
break;
...

Whenever we internally call btf_dump_kind_printf() - and have
a kind printf function - it is invoked, and once it's added formatting
it invokes the printf function. So there are two layers of callbacks

- the kind callback determines what we print based on the kinds
of objects provided (type names, member names, type data, etc); and
- the printf callback determines _how_ we print (e.g. to a file, stdout,
etc).

The above suggests we'd need to add btf_dump__*printf() functions.

This might allow us to refactor bpftool such that the
type traversal code lived in libbpf, while the specifics of
how that info is to be dumped live in bpftool. We'd probably
need to provide a C-style kind dumper out of the box in libbpf
as a default mechanism.

What do you think?

Alan