Re: BUG: unable to handle kernel paging request in bpf_dispatcher_xdp

From: Daniel Borkmann
Date: Fri Dec 09 2022 - 18:32:27 EST


On 12/10/22 12:07 AM, Jiri Olsa wrote:
On Fri, Dec 09, 2022 at 11:41:11PM +0100, Daniel Borkmann wrote:
On 12/9/22 10:53 PM, Jiri Olsa wrote:
On Fri, Dec 09, 2022 at 12:31:06PM -0800, Yonghong Song wrote:


On 12/9/22 7:20 AM, Jiri Olsa wrote:
On Fri, Dec 09, 2022 at 02:50:55PM +0100, Jiri Olsa wrote:
On Fri, Dec 09, 2022 at 12:22:37PM +0100, Jiri Olsa wrote:

SBIP


I'm trying to understand the severity of the issues and
whether we need to revert that commit asap since the merge window
is about to start.

Jiri, Peter,

ping.

cc-ing Thorsten, since he's tracking it now.

The config has CONFIG_X86_KERNEL_IBT=y.
Is it related?

sorry for late reply.. I still did not find the reason,
but I did not try with IBT yet, will test now

no difference with IBT enabled, can't reproduce the issue


ok, scratch that.. the reproducer got stuck on wifi init :-\

after I fix that I can now reproduce on my local config with
IBT enabled or disabled.. it's something else

I'm getting the error also when reverting the static call change,
looking for good commit, bisecting

I'm getting fail with:
f0c4d9fc9cc9 (tag: v6.1-rc4) Linux 6.1-rc4

v6.1-rc1 is ok

so far I narrowed it down between rc1 and rc3.. bisect got me nowhere so far

attaching some more logs

looking at the code.. how do we ensure that code running through
bpf_prog_run_xdp will not get dispatcher image changed while
it's being exetuted

we use 'the other half' of the image when we add/remove programs,
but could bpf_dispatcher_update race with bpf_prog_run_xdp like:


cpu 0: cpu 1:

bpf_prog_run_xdp
...
bpf_dispatcher_xdp_func
start exec image at offset 0x0

bpf_dispatcher_update
update image at offset 0x800
bpf_dispatcher_update
update image at offset 0x0

still in image at offset 0x0


that might explain why I wasn't able to trigger that on
bare metal just in qemu

I tried patch below and it fixes the issue for me and seems
to confirm the race above.. but not sure it's the best fix

jirka


---
diff --git a/kernel/bpf/dispatcher.c b/kernel/bpf/dispatcher.c
index c19719f48ce0..6a2ced102fc7 100644
--- a/kernel/bpf/dispatcher.c
+++ b/kernel/bpf/dispatcher.c
@@ -124,6 +124,7 @@ static void bpf_dispatcher_update(struct bpf_dispatcher *d, int prev_num_progs)
}
__BPF_DISPATCHER_UPDATE(d, new ?: (void *)&bpf_dispatcher_nop_func);
+ synchronize_rcu_tasks();
if (new)
d->image_off = noff;

This might work. In arch/x86/kernel/alternative.c, we have following
code and comments. For text_poke, synchronize_rcu_tasks() might be able
to avoid concurrent execution and update.

so my idea was that we need to ensure all the current callers of
bpf_dispatcher_xdp_func (which should have rcu read lock, based
on the comment in bpf_prog_run_xdp) are gone before and new ones
execute the new image, so the next call to the bpf_dispatcher_update
will be safe to overwrite the other half of the image

If v6.1-rc1 was indeed okay, then it looks like this may be related to
the trampoline patching for the static_call? Did it repro on v6.1-rc1
just with dbe69b299884 ("bpf: Fix dispatcher patchable function entry
to 5 bytes nop") cherry-picked?

I'll try that.. it looks to me like the problem was always there,
maybe harder to trigger.. also to reproduce it you need to call
bpf_dispatcher_update heavily, which is not probably the common
use case

one other thing is that I think the fix might need rcu locking
on the bpf_dispatcher_xdp_func side, because local_bh_disable
seems not to be enough to make synchronize_rcu_tasks work

I'm now testing patch below

jirka

---
diff --git a/include/linux/filter.h b/include/linux/filter.h
index efc42a6e3aed..a27245b96d6b 100644
--- a/include/linux/filter.h
+++ b/include/linux/filter.h
@@ -772,7 +772,13 @@ static __always_inline u32 bpf_prog_run_xdp(const struct bpf_prog *prog,
* under local_bh_disable(), which provides the needed RCU protection
* for accessing map entries.
*/
- u32 act = __bpf_prog_run(prog, xdp, BPF_DISPATCHER_FUNC(xdp));
+ u32 act;
+
+ rcu_read_lock();
+
+ act = __bpf_prog_run(prog, xdp, BPF_DISPATCHER_FUNC(xdp));
+
+ rcu_read_unlock();

fwiw, these should not be necessary, Documentation/RCU/checklist.rst :

[...] One example of non-obvious pairing is the XDP feature in networking,
which calls BPF programs from network-driver NAPI (softirq) context. BPF
relies heavily on RCU protection for its data structures, but because the
BPF program invocation happens entirely within a single local_bh_disable()
section in a NAPI poll cycle, this usage is safe. The reason that this usage
is safe is that readers can use anything that disables BH when updaters use
call_rcu() or synchronize_rcu(). [...]

if (static_branch_unlikely(&bpf_master_redirect_enabled_key)) {
if (act == XDP_TX && netif_is_bond_slave(xdp->rxq->dev))
diff --git a/kernel/bpf/dispatcher.c b/kernel/bpf/dispatcher.c
index c19719f48ce0..6a2ced102fc7 100644
--- a/kernel/bpf/dispatcher.c
+++ b/kernel/bpf/dispatcher.c
@@ -124,6 +124,7 @@ static void bpf_dispatcher_update(struct bpf_dispatcher *d, int prev_num_progs)
}
__BPF_DISPATCHER_UPDATE(d, new ?: (void *)&bpf_dispatcher_nop_func);
+ synchronize_rcu_tasks();
if (new)
d->image_off = noff;