Re: [PATCH bpf-next 2/3] bpf: btf: add btf json print functionality

From: Jakub Kicinski
Date: Tue Jun 26 2018 - 16:31:50 EST


On Tue, 26 Jun 2018 17:48:22 +0100, Okash Khawaja wrote:
> On Fri, Jun 22, 2018 at 05:26:39PM -0700, Martin KaFai Lau wrote:
> > On Fri, Jun 22, 2018 at 04:32:00PM -0700, Jakub Kicinski wrote:
> > > On Fri, 22 Jun 2018 15:54:08 -0700, Martin KaFai Lau wrote:
> > > > > > > > > > > > > "value": ["0x02","0x00","0x00","0x00","0x00","0x00","0x00","0x00"
> > > > > > > > > > > > > ],
> > > > > > > > > > > > > "value_struct":{
> > > > > > > > > > > > > "src_ip":2,
> > > > > > > > If for the same map the user changes the "src_ip" to an array of int[4]
> > > > > > > > later (e.g. to support ipv6), it will become "src_ip": [1, 2, 3, 4].
> > > > > > > > Is it breaking backward compat?
> > > > > > > > i.e.
> > > > > > > > struct five_tuples {
> > > > > > > > - int src_ip;
> > > > > > > > + int src_ip[4];
> > > > > > > > /* ... */
> > > > > > > > };
> > > > > > >
> > > > > > > Well, it is breaking backward compat, but it's the program doing it,
> > > > > > > not bpftool :) BTF changes so does the output.
> > > > > > As we see, the key/value's btf-output is inherently not backward compat.
> > > > > > Hence, "-j" and "-p" will stay as is. The whole existing json will
> > > > > > be backward compat instead of only partly backward compat.
> > > > >
> > > > > No. There is a difference between user of a facility changing their
> > > > > input and kernel/libraries providing different output in response to
> > > > > that, and the libraries suddenly changing the output on their own.
> > > > >
> > > > > Your example is like saying if user started using IPv6 addresses
> > > > > instead of IPv4 the netlink attributes in dumps will be different so
> > > > > kernel didn't keep backwards compat. While what you're doing is more
> > > > > equivalent to dropping support for old ioctl interfaces because there
> > > > > is a better mechanism now.
> > > > Sorry, I don't follow this. I don't see netlink suffer json issue like
> > > > the one on "key" and "value".
> > > >
> > > > All I can grasp is, the json should normally be backward compat but now
> > > > we are saying anything added by btf-output is an exception because
> > > > the script parsing it will treat it differently than "key" and "value"
> > >
> > > Backward compatibility means that if I run *the same* program against
> > > different kernels/libraries it continues to work. If someone decides
> > > to upgrade their program to work with IPv6 (which was your example)
> > > obviously there is no way system as a whole will look 1:1 the same.
> > >
> > > > > BTF in JSON is very useful, and will help people who writes simple
> > > > > orchestration/scripts based on bpftool *a* *lot*. I really appreciate
> > > > Can you share what the script will do? I want to understand why
> > > > it cannot directly use the BTF format and the map data.
> > >
> > > Think about a python script which wants to read a counter in a map.
> > > Right now it would have to get the BTF, find out which bytes are the
> > > counter, then convert the bytes into a larger int. With JSON BTF if
> > > just does entry["formatted"]["value"]["counter"].
> > >
> > > Real life example from my test code (conversion of 3 element counter
> > > array):
> > >
> > > def str2int(strtab):
> > > inttab = []
> > > for i in strtab:
> > > inttab.append(int(i, 16))
> > > ba = bytearray(inttab)
> > > if len(strtab) == 4:
> > > fmt = "I"
> > > elif len(strtab) == 8:
> > > fmt = "Q"
> > > else:
> > > raise Exception("String array of len %d can't be unpacked to an int" %
> > > (len(strtab)))
> > > return struct.unpack(fmt, ba)[0]
> > >
> > > def convert(elems, idx):
> > > val = []
> > > for i in range(3):
> > > part = elems[idx]["value"][i * length:(i + 1) * length]
> > > val.append(str2int(part))
> > > return val
> > >
> > > With BTF it would be:
> > >
> > > elems[idx]["formatted"]["value"]
> > >
> > > Which is fairly awesome.
> > Thanks for the example. Agree that with BTF, things are easier in general.
> >
> > btw, what more awesome is,
> > #> bpftool map find id 100 key 1
> > {
> > "counter_x": 1,
> > "counter_y": 10
> > }
> >
> > >
> > > > > this addition to bpftool and will start using it myself as soon as it
> > > > > lands. I'm not sure why the reluctance to slightly change the output
> > > > > format?
> > > > The initial change argument is because the json has to be backward compat.
> > > >
> > > > Then we show that btf-output is inherently not backward compat, so
> > > > printing it in json does not make sense at all.
> > > >
> > > > However, now it is saying part of it does not have to be backward compat.
> > >
> > > Compatibility of "formatted" member is defined as -> fields broken out
> > > according to BTF. So it is backward compatible. The definition of
> > > "value" member is -> an array of unfortunately formatted array of
> > > ugly hex strings :(
> > >
> > > > I am fine putting it under "formatted" for "-j" or "-p" if that is the
> > > > case, other than the double output is still confusing. Lets wait for
> > > > Okash's input.
> > > >
> > > > At the same time, the same output will be used as the default plaintext
> > > > output when BTF is available. Then the plaintext BTF output
> > > > will not be limited by the json restrictions when we want
> > > > to improve human readability later. Apparently, the
> > > > improvements on plaintext will not be always applicable
> > > > to json output.
> > >
>
> hi,
>
> so i guess following is what we want:
>
> 1. a "formatted" object nested inside -p and -j switches for bpf map
> dump. this will be JSON and backward compatible
> 2. an output for humans - which is like the current output. this will
> not be JSON. this won't have to be backward compatible. this output will
> be shown when neither of -j and -p are supplied and btf info is
> available.
>
> i can update the patches to v2 which covers 2 above + all other comments
> on the patchset. later we can follow up with a patch for 1.

Please do both at the same time. I've learnt not to trust people when
they say things like "we can follow up with xyz" :( In my experience it
_always_ backfires.

Implementing both outputs in one series will help you structure your
code to best suit both of the formats up front.