On Tue, Feb 4, 2025 at 11:35 PM Juntong Deng <juntong.deng@xxxxxxxxxxx> wrote:
This discussion comes from the "open-coded BPF file iterator" patch
series, which was NACKed and thus ended [0].
Thanks for the feedback from Christian, Linus, and Al, all very helpful.
The problems encountered in this patch series may also be encountered in
other BPF open-coded iterators to be added in the future, or in other
BPF usage scenarios.
So maybe this is a good opportunity for us to discuss all of this and
rethink BPF safety, BPF open-coded iterators, and possible improvements.
[0]: https://lore.kernel.org/bpf/AM6PR03MB50801990BD93BFA2297A123599EC2@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/T/#t
What do we expect from BPF safety?
----------------------------------
Christian points out the important fact that BPF programs can hold
references for a long time and cause weird issues.
This is an inherent flaw in BPF. Since the addition of bpf_loop and
BPF open-coded iterators, the myth that BPF is "absolutely" safe has
been broken.
The BPF verifier is a static verifier and has no way of knowing how
long a BPF program will actually run.
For example, the following BPF program can freeze your computer, but
can pass the BPF verifier smoothly.
SEC("raw_tp/sched_switch")
int BPF_PROG(on_switch)
{
	struct bpf_iter_num it;
	int *v;

	bpf_iter_num_new(&it, 0, 100000);
	while ((v = bpf_iter_num_next(&it))) {
		struct bpf_iter_num it2;

		bpf_iter_num_new(&it2, 0, 100000);
		while ((v = bpf_iter_num_next(&it2))) {
			bpf_printk("BPF Bomb\n");
		}
		bpf_iter_num_destroy(&it2);
	}
	bpf_iter_num_destroy(&it);
	return 0;
}
This BPF program runs a huge loop at each schedule.
bpf_iter_num_new is a common iterator that we can use in almost any
context, including LSM, sched-ext, tracing, etc.
We can run large, long loops on any critical code path and freeze the
system, since the BPF verifier has no way of knowing how long the
iteration will run.
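To put a rough number on the nested loop above, here is a small userspace C sketch (not BPF code) that multiplies out the two iteration ranges and converts the result to wall time. The helper names total_iterations() and estimated_seconds() are hypothetical, and the per-iteration cost passed in is purely an assumed figure, not a measurement of bpf_printk.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Number of inner-loop bodies executed by two nested iterators that
 * each walk a half-open range [0, n), as in the example program.
 * 64-bit arithmetic is needed: 100000 * 100000 overflows a 32-bit int.
 */
uint64_t total_iterations(uint64_t outer, uint64_t inner)
{
	return outer * inner;
}

/*
 * Rough wall time in seconds, given an ASSUMED cost per iteration in
 * nanoseconds. The cost is a caller-supplied guess, not a measured value.
 */
double estimated_seconds(uint64_t iterations, double ns_per_iter)
{
	assert(ns_per_iter >= 0.0);
	return (double)iterations * ns_per_iter / 1e9;
}
```

Calling total_iterations(100000, 100000) gives 10,000,000,000 loop bodies per sched_switch event. If each body cost as little as 100 ns (an assumption), estimated_seconds() would put one invocation at about 1000 seconds, which is why the program can stall a machine while still passing the verifier.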
This is completely orthogonal to the issue that Christian explained.
The long runtime of *malicious* bpf progs is a known issue and
there are WIP patches to address that.
Then holding references or holding locks in BPF programs doesn't seem
to be a problem?
It's a known issue.
This brings us back to the question at the beginning, what do we expect
from BPF safety?
Safety is paramount.
What do we expect from BPF and BPF open-coded iterators?
They are not special. All progs can be exploited if bad actors
try hard enough, including unprivileged progs like tcpdump.
That's why unpriv is disabled by default.
Would we expect BPF programs to have flexible access to more information
in the kernel?
Yes, but the tracing progs must be free of side effects.
Would we expect to have more BPF open-coded iterators allowing BPF
programs to iterate through various data structures in the kernel?
True, but it's nuanced.
What are the boundaries of what we expect BPF to be able to do?
Tracing bpf progs are read-only. If they cause side effects,
they must be fixed.
Of course, there may be risks, but maybe those risks can be solved by
improving BPF?
Please help by contributing patches instead of screaming "fire fire".