[PATCH v6 bpf-next 0/3] libbpf: BTF dumper support for typed data

From: Alan Maguire
Date: Thu Jul 15 2021 - 11:15:57 EST


Add a libbpf dumper function that supports dumping a representation
of data passed in using the BTF id associated with the data in a
manner similar to the bpf_snprintf_btf helper.

Default output format is identical to that dumped by bpf_snprintf_btf()
(bar using tabs instead of spaces for indentation, but the indent string
can be customized also); for example, a "struct sk_buff" representation
would look like this:

(struct sk_buff){
	(union){
		(struct){
			.next = (struct sk_buff *)0xffffffffffffffff,
			.prev = (struct sk_buff *)0xffffffffffffffff,
			(union){
				.dev = (struct net_device *)0xffffffffffffffff,
				.dev_scratch = (long unsigned int)18446744073709551615,
			},
		},
...

Patch 1 implements the dump functionality in a manner similar
to that in kernel/bpf/btf.c, but with a view to fitting into
libbpf more naturally. For example, rather than using flags,
boolean dump options are used to control output. In addition,
rather than combining checks for display (such as is this
field zero?) and actual display - as is done for the kernel
code - the code is organized to separate zero and overflow
checks from type display.
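
To illustrate the separation described above, here is a minimal sketch
that asks the "is this field zero?" question in one helper and formats
in another. The names data_is_zero() and show_int_member() and the
skip_zero flag are hypothetical and exist only for this example; this is
not the code in patch 1.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: true if the first sz bytes at data are all zero.
 * Compares against a zeroed buffer with memcmp(); sz is capped at 16 here.
 */
static bool data_is_zero(const void *data, size_t sz)
{
	static const char zeroes[16];

	assert(sz <= sizeof(zeroes));
	return memcmp(data, zeroes, sz) == 0;
}

/* Hypothetical display step: asks the zero question first, then formats.
 * Returns the number of characters written, or 0 if the member was skipped.
 */
static int show_int_member(char *out, size_t out_sz, const char *name,
			   int val, bool skip_zero)
{
	if (skip_zero && data_is_zero(&val, sizeof(val)))
		return 0;
	return snprintf(out, out_sz, ".%s = (int)%d,", name, val);
}
```

With skip_zero set, a zero-valued member produces no output and leaves
the buffer untouched; the formatting code never needs to know why.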

Patch 2 adds ASSERT_STRNEQ() for use in the following BTF dumper
tests.
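
As a rough idea of what a length-limited string assertion does, the
sketch below compares the first len bytes and reports only those bytes
on mismatch, using "%.*s". It is written as a plain function with a
hypothetical name, check_strneq(), rather than as the test_progs macro
itself, and the message format is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a length-limited string assertion: compares
 * the first len bytes and, on mismatch, formats a message into msg that
 * shows only those len bytes of each string.
 */
static bool check_strneq(const char *actual, const char *expected, int len,
			 const char *name, char *msg, size_t msg_sz)
{
	bool ok;

	assert(len >= 0);
	ok = strncmp(actual, expected, (size_t)len) == 0;
	if (!ok)
		snprintf(msg, msg_sz,
			 "unexpected %s: actual '%.*s' != expected '%.*s'",
			 name, len, actual, len, expected);
	return ok;
}
```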

Patch 3 consists of selftests that utilize a dump printf function
to snprintf the dump output to a string for comparison with
expected output. Tests deliberately mirror those in
snprintf_btf helper test to keep output consistent, but
also cover overflow handling, var/section display.
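
The "dump printf function that snprintfs to a string" idea can be
sketched as below. The callback takes (ctx, fmt, args), which is the
shape I believe libbpf's btf_dump_printf_fn_t uses; struct str_ctx,
str_printf() and emit() are hypothetical names for this example, and
emit() merely stands in for the dumper calling the callback.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical context: a fixed buffer that accumulates dump output. */
struct str_ctx {
	char buf[256];
	size_t len;
};

/* Callback with a (ctx, fmt, args) shape: appends formatted text to the
 * buffer and silently drops whatever does not fit.
 */
static void str_printf(void *ctx, const char *fmt, va_list args)
{
	struct str_ctx *s = ctx;
	size_t room = sizeof(s->buf) - s->len;
	int n;

	assert(s->len < sizeof(s->buf));
	n = vsnprintf(s->buf + s->len, room, fmt, args);
	if (n < 0)
		return;
	s->len += (size_t)n < room ? (size_t)n : room - 1;
}

/* Stand-in for the dumper: forwards variadic arguments to the callback. */
static void emit(struct str_ctx *s, const char *fmt, ...)
{
	va_list args;

	va_start(args, fmt);
	str_printf(s, fmt, args);
	va_end(args);
}
```

A test can then strcmp() the accumulated buffer against the expected
string once the dump has finished.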

Changes since v5 [1]
- readjust dump options to avoid unnecessary padding (Andrii, patch 1).
- tidied up bitfield data checking/retrieval using Andrii's suggestions.
Removed code where we adjust data pointer prior to calling bitfield
functions, as this adjustment is not needed provided we use the type
size as the number of bytes to iterate over when retrieving the full
value; we then apply bit shifting operations to that value to retrieve
the bitfield value. With these changes, the *_int_bits() functions were
no longer needed (Andrii, patch 1).
- coalesced the "is zero" checking for ints, floats and pointers
into btf_dump_base_type_check_zero(), using a memcmp() of the
size of the data. This can be derived from t->size for ints
and floats, and pointer size is retrieved from dump's ptr_sz
field (Andrii, patch 1).
- Added alignment-aware handling for int, enum, float retrieval.
Packed data structures can force ints, enums and floats to be
aligned on different boundaries; for example, the structure

struct p {
	char f1;
	int f2;
} __attribute__((packed));

...will have the int f2 field offset at byte 1, rather than at
byte 4 for an unpacked structure. Directly dereferencing a pointer
to such a misaligned field as an int is problematic on some platforms.
For ints and enums, we can reuse bitfield retrieval to get the
value for display, while for floats we use a local union of the
floating-point types and memcpy into it, ensuring we can then
dereference pointers into that union which will have safe alignment
(Andrii, patch 1).
- added comments to explain why we increment depth prior to displaying
opening parens, and decrement it prior to displaying closing parens
for structs, unions and arrays. The reason is that we don't want
to have a trailing newline when displaying a type. The logic that
handles this says "don't show a newline when the depth we're at is 0".
For this to work for opening parens then we need to bump depth before
showing opening parens + newline, and when we close out structure
we need to show closing parens after reducing depth so that we don't
append a newline to a top-level structure. So as a result we have

struct foo {\n
	struct bar {\n
	}\n
}

- silently truncate provided indent string with strncat() if > 31 bytes
(Andrii, patch 1).
- fixed ASSERT_STRNEQ() macro to show only n bytes of string
(Andrii, patch 2).
- fixed strncat() of type data string to avoid stack corruption
(Andrii, patch 3).
- removed early returns from dump type tests (Andrii, patch 3).
- have tests explicitly specify prefix (enum, struct, union)
(Andrii, patch 3).
- switch from CHECK() to ASSERT_* where possible (Andrii, patch 3).
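
For the alignment-aware float retrieval in the list above, a minimal
sketch of the memcpy-into-a-union approach follows. The names
struct packed_sample and read_float() are hypothetical, and the sketch
only handles float and double; it is not the code from patch 1.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Packed layout: f2 sits at byte offset 1, so it may be misaligned. */
struct packed_sample {
	char f1;
	float f2;
} __attribute__((packed));

/* Hypothetical helper: copy sz bytes into a suitably aligned union
 * before reading them as a floating-point value.
 */
static double read_float(const void *data, size_t sz)
{
	union {
		float f;
		double d;
		long double ld;
	} u;

	assert(sz == sizeof(float) || sz == sizeof(double));
	memcpy(&u, data, sz);
	return sz == sizeof(float) ? (double)u.f : u.d;
}
```

The caller passes a byte pointer to the (possibly misaligned) field;
the value is only ever read through the union, never through a cast of
the original pointer.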

Changes since v4 [2]
- Andrii kindly provided code to unify emitting a prepended cast
(for example "(int)") with existing code, and this had the nice
benefit of adding array indices in type specifications (Andrii,
patches 1, 3)
- Fixed indent_str option to make it a const char *, stored in a
fixed-length buffer internally (Andrii, patch 1)
- Reworked bit shift logic to minimize endian-specific interactions,
and use same macros as found elsewhere in libbpf to determine endianness
(Andrii, patch 1)
- Fixed type emitting to ensure that a trailing '\n' is not displayed;
newlines are added during struct/array display, but for a single type
the last character is no longer a newline (Andrii, patches 1, 3)
- Added support for ASSERT_STRNEQ() macro (Andrii, patch 2)
- Split tests into subtests for int, char, enum etc rather than one
"dump type data" subtest (Andrii, patch 3)
- Made better use of ASSERT* macros (Andrii, patch 3)
- Got rid of some other TEST_* macros that were unneeded (Andrii, patch 3)
- Switched to using "struct fs_context" to verify enum bitfield values
(Andrii, patch 3)
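
As a sketch of the bit shift approach mentioned in the list above:
gather the bytes of the underlying type into a 64-bit value, then shift
left and right to isolate the field. The name get_bitfield() is
hypothetical, and the use of the compiler-predefined __BYTE_ORDER__
macros is an assumption of this example, not necessarily what libbpf
uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: extract a bit_sz-wide bitfield that starts
 * bits_offset bits into a type_sz-byte integer stored at data.
 */
static uint64_t get_bitfield(const void *data, size_t type_sz,
			     unsigned int bits_offset, unsigned int bit_sz)
{
	const uint8_t *bytes = data;
	unsigned int nr_copy_bits;
	uint64_t num = 0;
	size_t i;

	assert(type_sz >= 1 && type_sz <= 8);
	assert(bit_sz >= 1 && bits_offset + bit_sz <= type_sz * 8);

	/* Step 1: gather type_sz bytes into num; only this step depends on
	 * byte order. Reading byte by byte also avoids alignment problems.
	 */
#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
	for (i = type_sz; i-- > 0; )
		num = num * 256 + bytes[i];
	nr_copy_bits = bit_sz + bits_offset;
#else
	for (i = 0; i < type_sz; i++)
		num = num * 256 + bytes[i];
	nr_copy_bits = (unsigned int)(type_sz * 8) - bits_offset;
#endif
	/* Step 2: shift left to drop the bits on one side of the field,
	 * then right to drop the bits on the other side.
	 */
	return (num << (64 - nr_copy_bits)) >> (64 - bit_sz);
}
```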

Changes since v3 [3]
- Retained separation of emitting of type name cast prefixing
type values from existing functionality such as btf_dump_emit_type_chain()
since initial code-shared version had so many exceptions it became
hard to read. For example, we don't emit a type name if the type
to be displayed is an array member; we also always emit "forward"
definitions for structs/unions that aren't really forward definitions
(we just want a "struct foo" output for "(struct foo){.bar = ...").
We also always ignore modifiers const/volatile/restrict as they
clutter output when emitting large types.
- Added configurable 4-char indent string option; defaults to tab
(Andrii)
- Added support for BTF_KIND_FLOAT and associated tests (Andrii)
- Added support for BTF_KIND_FUNC_PROTO function pointers to
improve output of "ops" structures; for example:

(struct file_operations){
	.owner = (struct module *)0xffffffffffffffff,
	.llseek = (loff_t(*)(struct file *, loff_t, int))0xffffffffffffffff,
	...
Added associated test also (Andrii)
- Added handling for enum bitfields and associated test (Andrii)
- Allocation of "struct btf_dump_data" done on-demand (Andrii)
- Removed ".field = " output from function emitting type name and
into caller (Andrii)
- Removed BTF_INT_OFFSET() support (Andrii)
- Use libbpf_err() to set errno for error cases (Andrii)
- btf_dump_dump_type_data() returns size written, which is used
when returning successfully from btf_dump__dump_type_data()
(Andrii)

Changes since v2 [4]
- Renamed function to btf_dump__dump_type_data, reorganized
arguments such that opts are last (Andrii)
- Modified code to separate questions about display such
as have we overflowed?/is this field zero? from actual
display of typed data, such that we ask those questions
separately from the code that actually displays typed data
(Andrii)
- Reworked code to handle overflow - where we do not provide
enough data for the type we wish to display - by returning
-E2BIG and attempting to present as much data as possible.
Such a mode of operation allows for tracers which retrieve
partial data (such as first 1024 bytes of a
"struct task_struct" say), and want to display that partial
data, while also knowing that it is not the full type.
Such tracers can then denote this (perhaps via "..." or
similar).
- Explored reusing existing type emit functions, such as
passing in a type id stack with a single type id to
btf_dump_emit_type_chain() to support the display of
typed data where a "cast" is prepended to the data to
denote its type; "(int)1", "(struct foo){", etc.
However, the task of emitting a
".field_name = (typecast)" did not match well with the model
of walking the stack to display innermost types first
and made the resultant code harder to read. Added a
dedicated btf_dump_emit_type_name() function instead which
is only ~70 lines (Andrii)
- Various cleanups around bitfield macros, unneeded member
iteration macros, avoiding compiler complaints when
displaying int data by casting to long long, etc (Andrii)
- Use DECLARE_LIBBPF_OPTS() in defining opts for tests (Andrii)
- Added more type tests, overflow tests, var tests and
section tests.
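
The overflow behaviour described in the list above (show as much as
possible, then report -E2BIG) can be sketched for a simple array of
ints. dump_int_array() is a hypothetical function for this example; the
-ENOSPC return for a full output buffer is likewise an assumption of
the sketch, not something the patch is stated to do.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical dumper for an array of nelems ints when only data_sz
 * bytes are available. Emits every element that fits completely;
 * returns the number of data bytes consumed, or -E2BIG if the data ran
 * out before the whole array was shown.
 */
static int dump_int_array(char *out, size_t out_sz, const void *data,
			  size_t data_sz, size_t nelems)
{
	const char *p = data;
	size_t i, len = 0;
	int v, n;

	assert(out_sz > 0);
	out[0] = '\0';
	for (i = 0; i < nelems; i++) {
		if ((i + 1) * sizeof(v) > data_sz)
			return -E2BIG;	/* partial output stays in out */
		memcpy(&v, p + i * sizeof(v), sizeof(v));
		n = snprintf(out + len, out_sz - len, "%d,", v);
		if (n < 0 || (size_t)n >= out_sz - len)
			return -ENOSPC;
		len += (size_t)n;
	}
	return (int)(nelems * sizeof(v));
}
```

A tracer that captured only part of a type could print whatever is in
out and, on -E2BIG, append "..." to mark the output as incomplete.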

Changes since RFC [5]
- The initial approach explored was to share the kernel code
with libbpf using #defines to paper over the different needs;
however it makes more sense to try and fit in with libbpf
code style for maintenance. A comment in the code points at
the implementation in kernel/bpf/btf.c and notes that any
issues found in it should be fixed there or vice versa;
mirroring the tests should help with this also
(Andrii)

[1] https://lore.kernel.org/bpf/1624092968-5598-1-git-send-email-alan.maguire@xxxxxxxxxx/
[2] https://lore.kernel.org/bpf/CAEf4BzYtbnphCkhz0epMKE4zWfvSOiMpu+-SXp9hadsrRApuZw@xxxxxxxxxxxxxx/T/
[3] https://lore.kernel.org/bpf/1622131170-8260-1-git-send-email-alan.maguire@xxxxxxxxxx/
[4] https://lore.kernel.org/bpf/1610921764-7526-1-git-send-email-alan.maguire@xxxxxxxxxx/
[5] https://lore.kernel.org/bpf/1610386373-24162-1-git-send-email-alan.maguire@xxxxxxxxxx/

Alan Maguire (3):
libbpf: BTF dumper support for typed data
selftests/bpf: add ASSERT_STRNEQ() variant for test_progs
selftests/bpf: add dump type data tests to btf dump tests

tools/lib/bpf/btf.h | 19 +
tools/lib/bpf/btf_dump.c | 819 +++++++++++++++++++++-
tools/lib/bpf/libbpf.map | 1 +
tools/testing/selftests/bpf/prog_tests/btf_dump.c | 600 ++++++++++++++++
tools/testing/selftests/bpf/test_progs.h | 12 +
5 files changed, 1446 insertions(+), 5 deletions(-)

--
1.8.3.1