Re: binary blob no more! Was: [RFC PATCH tip 0/5] tracing filters with BPF

From: Steven Rostedt
Date: Mon Dec 09 2013 - 10:02:59 EST


On Sun, 8 Dec 2013 19:36:18 -0800
Alexei Starovoitov <ast@xxxxxxxxxxxx> wrote:


> Actually I think there are a few ways to include the source equivalent
> in the bpf image.
>
> Approach #1
> include original C source code into bpf image:
> bpf_image = bpf_insns + original_C
> this will imply that C code can have #include's of linux kernel headers only
> and it can only be C source.
> this way the user can do 'cat /sys/kernel/debug/bpf/filter', kernel
> will print original_C and these restrictions will guarantee that it
> will compile into similar bpf code whether gcc or llvm compiler is
> used.
>
> Approach #2
> include original llvm bitcode:
> bpf_image = bpf_insns + llvm_bc
> The user can do 'cat .../filter' and use llvm-dis to see human readable bitcode.
> It takes practice to read it, but it's high level enough to understand
> what filter is doing. llvm-llc can be used to generate bpf_insns
> again, or generate C from bitcode.
> Pro vs 1: bitcode is very compact
> Con: only the llvm compiler can be used to generate bpf instructions
>
> Enforcement can be done by having a user space daemon that
> walks over all loaded filters and recompiles them from C or from bitcode.
>
> Please let me know which approach you prefer.

I don't like either. Different compilers may produce different
results, so that daemon may not be able to verify that what is in the
C code is really what's in the bitcode.

>
> I still think that bpf_image = bpf_insns + license_string is just as good,
> since bpf code can only call tiny set of functions, so no matter what
> the code does its scope is very limited and license enforcement
> guarantees that original source has to be available,
> but I'm ok whichever way.

I like that approach much better. That is, all binary code must state
that it is under the GPL. That way, if you give a binary to someone,
you must also supply the source under the GPL license.
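
To make the "bpf_insns + license_string" idea concrete, here is a rough
userspace sketch of the kind of check a loader could do. Everything in it
is an assumption for illustration only: the struct name, its fields, the
helper name, and the list of accepted license strings are hypothetical and
are not taken from the patch set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout: bpf_image = bpf_insns + license_string. */
struct bpf_image_sketch {
	const uint64_t *insns;	/* opaque instruction words, not decoded here */
	size_t insn_cnt;	/* number of instruction words */
	const char *license;	/* NUL-terminated license string, may be NULL */
};

/*
 * Return true only if the image declares one of the accepted licenses.
 * The accepted list is an assumption made for this sketch.
 */
static bool image_license_ok(const struct bpf_image_sketch *img)
{
	static const char *const accepted[] = { "GPL", "GPL v2" };
	size_t i;

	assert(img != NULL);

	/* An image without a license string is rejected outright. */
	if (img->license == NULL)
		return false;

	for (i = 0; i < sizeof(accepted) / sizeof(accepted[0]); i++) {
		if (strcmp(img->license, accepted[i]) == 0)
			return true;
	}
	return false;
}
```

A loader built along these lines would refuse any image for which
image_license_ok() returns false, before looking at the instructions at
all.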

Having a disassembler in the kernel has the added benefit that you can
see what code is loaded. We can have a userspace tool to make even more
sense out of the disassembled code.

I don't think the kernel should have anything more than a disassembler
though. Maybe that's even too much, but at least a human can inspect it
a little without needing extra tools.

>
> Also please indicate whether gcc or llvm backend is preferred to
> be hosted in tools.

If we end up placing a compiler in tools, then that compiler should
also be able to compile the entire kernel.

Maybe we will finally get our kcc ;-)

-- Steve