In order to let unprivileged users load and execute eBPF programs,
teach the verifier to prevent pointer leaks.
The verifier will prevent:
- any arithmetic on pointers
(except R10+Imm which is used to compute stack addresses)
- comparison of pointers
- passing pointers to helper functions
- indirectly passing pointers in stack to helper functions
- returning pointer from bpf program
- storing pointers into ctx or maps
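The arithmetic rule and its R10+Imm exception can be pictured with a small
user-space model. This is only a sketch, not the verifier code: the enum
names and unpriv_alu_allowed() are hypothetical, and only full 64-bit
operations are modeled.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical operand classification; the real verifier tracks more types. */
enum operand_kind {
	OPERAND_SCALAR,		/* plain number or immediate */
	OPERAND_FRAME_PTR,	/* value of R10, the read-only frame pointer */
	OPERAND_OTHER_PTR	/* any other pointer: ctx, map, map value, stack */
};

/* Only a handful of 64-bit ALU operations are modeled here. */
enum alu_op { ALU_MOV, ALU_ADD, ALU_SUB, ALU_NEG };

static bool is_pointer(enum operand_kind k)
{
	return k != OPERAND_SCALAR;
}

/*
 * Returns true if this model lets an unprivileged program execute
 * "dst = dst <op> src". For ALU_NEG the source is ignored; pass
 * OPERAND_SCALAR with src_is_imm set.
 */
static bool unpriv_alu_allowed(enum alu_op op, enum operand_kind dst,
			       enum operand_kind src, bool src_is_imm)
{
	/* An immediate is never a pointer. */
	assert(!src_is_imm || src == OPERAND_SCALAR);

	/* A full register copy keeps the pointer type and leaks nothing. */
	if (op == ALU_MOV)
		return true;

	/* The one exception: frame pointer + Imm computes a stack address. */
	if (op == ALU_ADD && dst == OPERAND_FRAME_PTR && src_is_imm)
		return true;

	/* Everything else must operate on scalars only. */
	return !is_pointer(dst) && !is_pointer(src);
}
```

In this model a copy of R10 plus an immediate is accepted, while adding a
constant to a ctx pointer, adding two pointers, or negating a pointer is
rejected.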
Spill/fill of pointers into stack is allowed, but mangling
of pointers stored in the stack or reading them byte by byte is not.
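The spill/fill rule can be sketched the same way. The model below is an
illustration under stated assumptions, not the kernel implementation:
struct stack_model and both functions are hypothetical, offsets count up
from the bottom of the stack area instead of down from R10, and letting a
full 8-byte scalar store replace a spilled pointer is a choice made for
this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define SLOT_SIZE 8	/* a pointer occupies one aligned 8-byte slot */
#define NUM_SLOTS 64	/* 64 * 8 = 512 bytes of stack */

/* Hypothetical per-slot tracking: does the slot hold a spilled pointer? */
struct stack_model {
	bool holds_ptr[NUM_SLOTS];
};

static void stack_init(struct stack_model *s)
{
	memset(s, 0, sizeof(*s));
}

/* Returns true if this model accepts the store from an unprivileged program. */
static bool unpriv_stack_write(struct stack_model *s, int off, int size,
			       bool value_is_ptr)
{
	assert(off >= 0 && size > 0 && off + size <= SLOT_SIZE * NUM_SLOTS);

	int first = off / SLOT_SIZE;
	int last = (off + size - 1) / SLOT_SIZE;
	bool full_slot = size == SLOT_SIZE && off % SLOT_SIZE == 0;
	int i;

	if (value_is_ptr) {
		/* A pointer may only be spilled as a whole aligned slot. */
		if (!full_slot)
			return false;
		s->holds_ptr[first] = true;
		return true;
	}

	/* A partial scalar store must not mangle a spilled pointer. */
	for (i = first; i <= last; i++)
		if (s->holds_ptr[i] && !full_slot)
			return false;

	for (i = first; i <= last; i++)
		s->holds_ptr[i] = false;
	return true;
}

/* Returns true if this model accepts the load from an unprivileged program. */
static bool unpriv_stack_read(const struct stack_model *s, int off, int size)
{
	assert(off >= 0 && size > 0 && off + size <= SLOT_SIZE * NUM_SLOTS);

	int first = off / SLOT_SIZE;
	int last = (off + size - 1) / SLOT_SIZE;
	bool full_slot = size == SLOT_SIZE && off % SLOT_SIZE == 0;
	int i;

	/* Reading only part of a spilled pointer would expose its bytes. */
	for (i = first; i <= last; i++)
		if (s->holds_ptr[i] && !full_slot)
			return false;
	return true;
}
```

Here an 8-byte spill followed by an 8-byte fill of the same slot is
accepted, while a 1-byte or 4-byte access to that slot is rejected.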
Within bpf programs the pointers do exist, since programs need to
be able to access maps, pass the skb pointer to LD_ABS insns, etc.,
but programs cannot pass such pointer values to the outside
or obfuscate them.
Only allow BPF_PROG_TYPE_SOCKET_FILTER unprivileged programs,
so that socket filters (tcpdump), af_packet (quic acceleration)
and future kcm can use it.
Tracing and tc cls/act program types still require root permissions,
since tracing actually needs to be able to see all kernel pointers
and tc is for root only.
For example, the following unprivileged socket filter program is allowed:
int foo(struct __sk_buff *skb)
{
	char fmt[] = "hello %d\n";
	bpf_trace_printk(fmt, sizeof(fmt), skb->len);
	return 0;
}
but the following program is not:
int foo(struct __sk_buff *skb)
{
	char fmt[] = "hello %p\n";
	bpf_trace_printk(fmt, sizeof(fmt), fmt);
	return 0;
}
since it would leak the kernel stack address via bpf_trace_printk().
Unprivileged socket filter bpf programs have access to the
following helper functions:
- map lookup/update/delete (but they cannot store kernel pointers into them)
- get_random (it's already exposed to unprivileged user space)
- get_smp_processor_id
- tail_call into another socket filter program
- ktime_get_ns
- bpf_trace_printk (for debugging)
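The helper list above amounts to an allow-list lookup. A minimal sketch
follows; enum helper_id and the function name are hypothetical and do not
match the kernel's helper numbering, and HELPER_PROBE_READ is included
only as an example of a tracing helper that is not on the list.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical identifiers for this sketch, not the kernel's helper ids. */
enum helper_id {
	HELPER_MAP_LOOKUP_ELEM,
	HELPER_MAP_UPDATE_ELEM,
	HELPER_MAP_DELETE_ELEM,
	HELPER_GET_RANDOM,
	HELPER_GET_SMP_PROCESSOR_ID,
	HELPER_TAIL_CALL,
	HELPER_KTIME_GET_NS,
	HELPER_TRACE_PRINTK,
	HELPER_PROBE_READ,	/* tracing helper, not on the list */
	HELPER_ID_MAX
};

/* Returns true if the helper is on the unprivileged socket filter list. */
static bool unpriv_socket_filter_helper_allowed(enum helper_id id)
{
	assert((int)id >= 0 && (int)id < HELPER_ID_MAX);

	switch (id) {
	case HELPER_MAP_LOOKUP_ELEM:
	case HELPER_MAP_UPDATE_ELEM:
	case HELPER_MAP_DELETE_ELEM:
	case HELPER_GET_RANDOM:
	case HELPER_GET_SMP_PROCESSOR_ID:
	case HELPER_TAIL_CALL:
	case HELPER_KTIME_GET_NS:
	case HELPER_TRACE_PRINTK:
		return true;
	default:
		/* Anything not listed stays unavailable. */
		return false;
	}
}
```

Being on the list does not lift the pointer rules: arguments passed to an
allowed helper are still checked for pointer leaks.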
The feature is controlled by sysctl kernel.bpf_enable_unprivileged
which is off by default.
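User space can check the switch before trying to load a program. The
sketch below assumes the sysctl is exposed at the usual /proc/sys
location for its name; the macro and function names are hypothetical,
and a kernel without this patch will not have the file.

```c
#include <assert.h>
#include <stdio.h>

/* Assumed path, derived from the sysctl name kernel.bpf_enable_unprivileged. */
#define UNPRIV_BPF_SYSCTL_PATH "/proc/sys/kernel/bpf_enable_unprivileged"

/*
 * Returns 1 if the sysctl file holds a non-zero value, 0 if it holds
 * zero, and -1 if the file cannot be opened or parsed.
 */
static int read_unpriv_bpf_sysctl(const char *path)
{
	assert(path != NULL);

	FILE *f = fopen(path, "r");
	int value;

	if (!f)
		return -1;

	if (fscanf(f, "%d", &value) != 1) {
		fclose(f);
		return -1;
	}

	fclose(f);
	return value != 0;
}
```

Calling read_unpriv_bpf_sysctl(UNPRIV_BPF_SYSCTL_PATH) and treating both
0 and -1 as "not available" keeps the caller on the safe side.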
New tests were added to test_verifier:
unpriv: return pointer OK
unpriv: add const to pointer OK
unpriv: add pointer to pointer OK
unpriv: neg pointer OK
unpriv: cmp pointer with const OK
unpriv: cmp pointer with pointer OK
unpriv: pass pointer to printk OK
unpriv: pass pointer to helper function OK
unpriv: indirectly pass pointer on stack to helper function OK
unpriv: mangle pointer on stack 1 OK
unpriv: mangle pointer on stack 2 OK
unpriv: read pointer from stack in small chunks OK
unpriv: write pointer into ctx OK
unpriv: write pointer into map elem value OK
unpriv: partial copy of pointer OK
Signed-off-by: Alexei Starovoitov <ast@xxxxxxxxxxxx>