Re: [PATCH 1/2] bpf: add a bpf_override_function helper

From: Alexei Starovoitov
Date: Sun Nov 12 2017 - 01:50:21 EST


On 11/11/17 4:14 PM, Ingo Molnar wrote:

* Josef Bacik <josef@xxxxxxxxxxxxxx> wrote:

On Fri, Nov 10, 2017 at 10:34:59AM +0100, Ingo Molnar wrote:

* Josef Bacik <josef@xxxxxxxxxxxxxx> wrote:

@@ -551,6 +578,10 @@ static const struct bpf_func_proto *kprobe_prog_func_proto(enum bpf_func_id func
 		return &bpf_get_stackid_proto;
 	case BPF_FUNC_perf_event_read_value:
 		return &bpf_perf_event_read_value_proto;
+	case BPF_FUNC_override_return:
+		pr_warn_ratelimited("%s[%d] is installing a program with bpf_override_return helper that may cause unexpected behavior!",
+				    current->comm, task_pid_nr(current));
+		return &bpf_override_return_proto;

So if this new functionality is used we'll always print this into the syslog?

The warning is also a bit passive-aggressive about informing the user: what
unexpected behavior can happen, and what is the worst case?


It's modeled after the other warnings bpf will spit out, but with this feature
you are skipping a function and instead returning some arbitrary value, so
anything could go wrong if you mess something up. For instance I screwed up my
initial test case and made every IO submitted return an error instead of just on
the one file system I was attempting to test, so all sorts of hilarity ensued.

Ok, then for the x86 bits:

NAK-ed-by: Ingo Molnar <mingo@xxxxxxxxxx>

One of the major advantages of having an in-kernel BPF sandbox is to never crash
the kernel - and allowing BPF programs to just randomly modify the return value of
kernel functions sounds immensely broken to me.

(And yes, I realize that kprobes are used here as a vehicle, but the point
remains.)

Yeah. Modifying an arbitrary function's return value pushes bpf outside
of its safety guarantees, and in that sense the same override_return
could be done from a kernel module if the kernel provides the x64 side
of the facility introduced by this patch.
On the other hand, adding parts of this feature to the kernel only
to be used by an external kernel module is quite ugly too, and not
something that has ever been done before.
How about we restrict bpf_override_return() to the functions
whose callers expect to handle errors?
We can add something similar to NOKPROBE_SYMBOL(), like
ALLOW_RETURN_OVERRIDE(), and on the btrfs side mark the functions
we're going to test with this feature.
Then the 'not crashing the kernel' requirement will be preserved.
btrfs, or whatever else we test with override_return, will be
functioning in 'stress test' mode, and if the bpf program
is not careful and returns an error all the time, then that one
subsystem (like btrfs) will not be functional, but the kernel
will not crash.
Thoughts?
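
To make the proposal concrete, here is a rough userspace model of the
allowlist idea, not kernel code. Everything in it is hypothetical:
ALLOW_RETURN_OVERRIDE() here just expands to a table entry, whereas a real
implementation would presumably record addresses in a dedicated section the
way NOKPROBE_SYMBOL() does. submit_io(), update_stats(), and
attach_override_prog() are made-up stand-ins for illustration only.

```c
#include <errno.h>
#include <stddef.h>

/* Generic function-pointer type, used only to store and compare addresses. */
typedef void (*generic_fn_t)(void);

struct override_entry {
	const char *name;	/* symbol name, kept for diagnostics */
	generic_fn_t addr;	/* address of the function that may be overridden */
};

/*
 * Hypothetical marker (assumption): in this model it expands to one
 * allowlist entry. It is not an existing kernel macro.
 */
#define ALLOW_RETURN_OVERRIDE(fn) { #fn, (generic_fn_t)(fn) }

/* Stand-in for a function whose callers check and handle an error return. */
static int submit_io(int sector)
{
	(void)sector;
	return 0;
}

/* Stand-in for a function with no error path; it must not be overridden. */
static void update_stats(int delta)
{
	(void)delta;
}

/* Only functions explicitly marked by their subsystem are listed here. */
static const struct override_entry override_allowlist[] = {
	ALLOW_RETURN_OVERRIDE(submit_io),
};

/* Return 1 if addr was marked as safe for return-value override, else 0. */
static int override_allowed(generic_fn_t addr)
{
	size_t i;
	size_t n = sizeof(override_allowlist) / sizeof(override_allowlist[0]);

	for (i = 0; i < n; i++)
		if (override_allowlist[i].addr == addr)
			return 1;
	return 0;
}

/*
 * Model of an attach-time check: a program that wants to override the
 * return value of target is accepted only if target is on the allowlist.
 */
static int attach_override_prog(generic_fn_t target)
{
	return override_allowed(target) ? 0 : -EINVAL;
}
```

In this model, attach_override_prog((generic_fn_t)submit_io) succeeds, while
the same call with update_stats is refused with -EINVAL, so a careless
program can break the marked subsystem but cannot touch unmarked functions.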