Re: [PATCH v2 net-next 0/2] split BPF out of core networking

From: Alexei Starovoitov
Date: Mon Jun 02 2014 - 15:02:16 EST


On Mon, Jun 2, 2014 at 10:04 AM, Daniel Borkmann <dborkman@xxxxxxxxxx> wrote:
> On 06/02/2014 05:41 PM, Alexei Starovoitov wrote:
> ...
>
>> Glad you brought up this point :)
>> 100% agree that current double verification done by seccomp is far from
>> being generic and quite hard to maintain, since any change done to
>> classic BPF verifier needs to be thought through from
>> seccomp_check_filter()
>> perspective as well.
>
>
> Glad we're on the same page.
>
>
>> BPF's input context, set of allowed calls need to be expressed in a
>> generic way.
>> Obviously this split by itself won't make classic BPF all of a sudden
>> generic.
>> It rather defines a boundary of eBPF core.
>
>
> Note, I'm not at all against using it in tracing, I think it's probably
> a good idea, but shouldn't we _first_ think about how to overcome such
> deficits as above by improving upon its in-kernel API design, thus to
> better prepare it to be generic? I feel this step is otherwise just
> skipped and quickly 'hacked' around ... ;)

Are you talking about deficits in classic BPF or in eBPF?
Classic has all sorts of hard-coded assumptions. The whole
concept of 'load from magic constant' meaning different things
is flawed. We all got used to it and now think that it's normal
for "ld_abs -4056" to mean "a ^= x".
This split is not trying to make classic easier to hack.
With eBPF underneath classic, it got a lot easier to add extensions
to classic, but we shouldn't be doing it.
Classic BPF is not generic and cannot become generic. That's eBPF's job.
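
For readers wondering where -4056 comes from, here is a minimal sketch.
The two constants mirror the values in include/uapi/linux/filter.h and
are redefined locally so the snippet builds without kernel headers.
describe_ld_abs() and demo() are hypothetical helpers written for this
illustration; they are not kernel functions, and only the two cases
discussed above are modeled.

```c
#include <assert.h>
#include <string.h>

/* Values mirror include/uapi/linux/filter.h; redefined here (an
 * assumption of this sketch) so it builds without kernel headers. */
#define SKF_AD_OFF        (-0x1000)  /* base of the ancillary "magic" range */
#define SKF_AD_ALU_XOR_X  40         /* ancillary op: A ^= X */

/* Hypothetical helper, not a kernel function: say what a classic BPF
 * absolute load at offset k means, for the cases modeled here. */
static const char *describe_ld_abs(int k)
{
	if (k == SKF_AD_OFF + SKF_AD_ALU_XOR_X)
		return "a ^= x";               /* not a load at all */
	if (k >= 0)
		return "load from packet";     /* ordinary packet offset */
	return "other special offset";     /* remaining negative offsets, not modeled */
}

/* Usage: the offset quoted in the mail decodes to the XOR operation. */
static void demo(void)
{
	assert(SKF_AD_OFF + SKF_AD_ALU_XOR_X == -4056);
	assert(strcmp(describe_ld_abs(-4056), "a ^= x") == 0);
}
```

The same "load" opcode changes meaning purely based on the constant,
which is the hard-coded assumption being criticized.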

The split mainly helps to clearly show the boundary between the eBPF core
and its socket use case. It doesn't change or add any API.
We need to design the eBPF APIs carefully when we expose them
to user space. I have a proposal for that too, but that's a separate
discussion.
In terms of the in-kernel eBPF API there is nothing to be done.
An eBPF program 'prog' is generated by whatever means, and then:
    struct sk_filter *fp;

    fp = kzalloc(sk_filter_size(prog_len), GFP_KERNEL);
    memcpy(fp->insnsi, prog, prog_len * sizeof(fp->insnsi[0]));
    fp->len = prog_len;

    sk_filter_select_runtime(fp); // select interpreter or JIT
    SK_RUN_FILTER(fp, ctx); // run the program
    sk_filter_free(fp); // free program

That's how sockets, the testsuite, seccomp and tracing are doing it.
All have different ways of producing 'prog' and 'prog_len'.
This in-kernel API cleanup was done in commit 5fe821a9dee2.
You even acked it back then :)
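
The allocate / copy / select runtime / run / free sequence can be tried
out in user space with a toy model. This is only a sketch under stated
assumptions: every toy_* name and TOY_RUN_FILTER is hypothetical, the
"instruction set" just adds immediates to the context value, and there
is no JIT, so the runtime selection always picks the interpreter. It
shows the shape of the lifecycle, not the kernel implementation.

```c
#include <stdlib.h>
#include <string.h>

/* Toy instruction (hypothetical): each one adds 'imm' to the result. */
struct toy_insn { int imm; };

/* Toy analogue of a filter object: length, run hook, trailing program. */
struct toy_filter {
	unsigned int len;
	unsigned int (*run)(const struct toy_filter *fp, const void *ctx);
	struct toy_insn insns[];   /* flexible array, sized at allocation */
};

static size_t toy_filter_size(unsigned int len)
{
	return sizeof(struct toy_filter) + len * sizeof(struct toy_insn);
}

/* Toy interpreter: start from the context value, add every immediate. */
static unsigned int toy_interpret(const struct toy_filter *fp, const void *ctx)
{
	unsigned int acc = *(const unsigned int *)ctx;

	for (unsigned int i = 0; i < fp->len; i++)
		acc += (unsigned int)fp->insns[i].imm;
	return acc;
}

/* A real implementation might pick a JIT here; the sketch has none. */
static void toy_select_runtime(struct toy_filter *fp)
{
	fp->run = toy_interpret;
}

#define TOY_RUN_FILTER(fp, ctx) ((fp)->run((fp), (ctx)))

/* Whole lifecycle in one place; returns 0 if allocation fails. */
static unsigned int toy_run_once(const struct toy_insn *prog,
				 unsigned int prog_len, const void *ctx)
{
	struct toy_filter *fp = calloc(1, toy_filter_size(prog_len));
	unsigned int ret;

	if (!fp)
		return 0;
	memcpy(fp->insns, prog, prog_len * sizeof(fp->insns[0]));
	fp->len = prog_len;

	toy_select_runtime(fp);            /* choose how to execute */
	ret = TOY_RUN_FILTER(fp, ctx);     /* run the program */
	free(fp);                          /* release the program */
	return ret;
}
```

Callers differ only in how they produce 'prog' and 'prog_len'; the
steps inside toy_run_once() stay the same for all of them.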

If you're referring to an in-kernel API for the eBPF verifier, then yeah,
it's missing, just like the whole eBPF verifier :)
Ideally, any kernel component that generates eBPF on the fly
would send the eBPF program to the verifier first, just to double check
that the generated program is valid.