Re: BUG: unable to handle kernel paging request in bpf_dispatcher_xdp

From: Yonghong Song
Date: Fri Dec 09 2022 - 15:32:03 EST




On 12/9/22 7:20 AM, Jiri Olsa wrote:
On Fri, Dec 09, 2022 at 02:50:55PM +0100, Jiri Olsa wrote:
On Fri, Dec 09, 2022 at 12:22:37PM +0100, Jiri Olsa wrote:

SBIP


I'm trying to understand the severity of the issues and
whether we need to revert that commit asap since the merge window
is about to start.

Jiri, Peter,

ping.

cc-ing Thorsten, since he's tracking it now.

The config has CONFIG_X86_KERNEL_IBT=y.
Is it related?

sorry for late reply.. I still did not find the reason,
but I did not try with IBT yet, will test now

no difference with IBT enabled, can't reproduce the issue


ok, scratch that.. the reproducer got stuck on wifi init :-\

after I fix that I can now reproduce on my local config with
IBT enabled or disabled.. it's something else

I'm getting the error also when reverting the static call change,
looking for good commit, bisecting

I'm getting fail with:
f0c4d9fc9cc9 (tag: v6.1-rc4) Linux 6.1-rc4

v6.1-rc1 is ok

so far I narrowed it down between rc1 and rc3.. bisect got me nowhere so far

attaching some more logs

looking at the code.. how do we ensure that code running through
bpf_prog_run_xdp will not get dispatcher image changed while
it's being exetuted

we use 'the other half' of the image when we add/remove programs,
but could bpf_dispatcher_update race with bpf_prog_run_xdp like:


cpu 0: cpu 1:

bpf_prog_run_xdp
...
bpf_dispatcher_xdp_func
start exec image at offset 0x0

bpf_dispatcher_update
update image at offset 0x800
bpf_dispatcher_update
update image at offset 0x0

still in image at offset 0x0


that might explain why I wasn't able to trigger that on
bare metal just in qemu

I tried patch below and it fixes the issue for me and seems
to confirm the race above.. but not sure it's the best fix

jirka


---
diff --git a/kernel/bpf/dispatcher.c b/kernel/bpf/dispatcher.c
index c19719f48ce0..6a2ced102fc7 100644
--- a/kernel/bpf/dispatcher.c
+++ b/kernel/bpf/dispatcher.c
@@ -124,6 +124,7 @@ static void bpf_dispatcher_update(struct bpf_dispatcher *d, int prev_num_progs)
}
__BPF_DISPATCHER_UPDATE(d, new ?: (void *)&bpf_dispatcher_nop_func);
+ synchronize_rcu_tasks();
if (new)
d->image_off = noff;

This might work. In arch/x86/kernel/alternative.c, we have following
code and comments. For text_poke, synchronize_rcu_tasks() might be able
to avoid concurrent execution and update.

/**
* text_poke_copy - Copy instructions into (an unused part of) RX memory
* @addr: address to modify
* @opcode: source of the copy
* @len: length to copy, could be more than 2x PAGE_SIZE
*
* Not safe against concurrent execution; useful for JITs to dump
* new code blocks into unused regions of RX memory. Can be used in
* conjunction with synchronize_rcu_tasks() to wait for existing
* execution to quiesce after having made sure no existing functions
* pointers are live.
*/
void *text_poke_copy(void *addr, const void *opcode, size_t len)
{
unsigned long start = (unsigned long)addr;
size_t patched = 0;

if (WARN_ON_ONCE(core_kernel_text(start)))
return NULL;

mutex_lock(&text_mutex);
while (patched < len) {
unsigned long ptr = start + patched;
size_t s;

s = min_t(size_t, PAGE_SIZE * 2 - offset_in_page(ptr), len - patched);

__text_poke(text_poke_memcpy, (void *)ptr, opcode + patched, s);
patched += s;
}
mutex_unlock(&text_mutex);
return addr;
}