[REPORT] BPF: Reproducible triggering of BUG() from userspace PoC

From: Lee Jones
Date: Wed Nov 08 2023 - 10:46:38 EST


Good afternoon,

After coming across a recent Syzkaller report [0] I thought I'd take
some time to firstly reproduce the issue, then see if there was a
trivial way to mitigate it. The report suggests that a BUG() in
prog_array_map_poke_run() [1] can be trivially and reliably triggered
from userspace using the PoC provided [2].

ret = bpf_arch_text_poke(poke->tailcall_bypass,
BPF_MOD_JUMP,
old_bypass_addr,
poke->bypass_addr);
BUG_ON(ret < 0 && ret != -EINVAL);

Indeed the PoC does seem to be able to consistently trigger the BUG(),
not only on the reported kernel (v6.1), but also on linux-next. I went
to the trouble of checking LORE, but failed to find any patches which
may be attempting to fix this.

kernel BUG at kernel/bpf/arraymap.c:1094!
invalid opcode: 0000 [#1] PREEMPT SMP KASAN
CPU: 5 PID: 45 Comm: kworker/5:0 Not tainted 6.6.0-rc3-next-20230929-dirty #74
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
Workqueue: events prog_array_map_clear_deferred
RIP: 0010:prog_array_map_poke_run+0x6b4/0x6d0
Code: ff 0f 0b e8 1e 27 e1 ff 48 c7 c7 60 80 93 85 48 c7 c6 00 7f 93 85 48 c7 c2 bb c2 39 86 b9 45 04 00 00 45 89 f8 e8 9c 890
RSP: 0018:ffffc9000036fb50 EFLAGS: 00010246
RAX: 0000000000000044 RBX: ffff88811f337490 RCX: 63af48a1314f9900
RDX: 0000000000000000 RSI: 0000000080000000 RDI: 0000000000000000
RBP: ffffc9000036fbe8 R08: ffffffff815c23c5 R09: 1ffff11084c14eba
R10: dfffe91084c14ebc R11: ffffed1084c14ebb R12: ffff888116517800
R13: dffffc0000000000 R14: ffff888125a1a400 R15: 00000000fffffff0
FS: 0000000000000000(0000) GS:ffff888426080000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000004ab678 CR3: 0000000122ac4000 CR4: 0000000000350eb0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
? __die_body+0x92/0xf0
? die+0xa2/0xe0
? do_trap+0x12f/0x370
? handle_invalid_op+0xa6/0x140
? handle_invalid_op+0xdf/0x140
? prog_array_map_poke_run+0x6b4/0x6d0
? prog_array_map_poke_run+0x6b4/0x6d0
? exc_invalid_op+0x32/0x50
? asm_exc_invalid_op+0x1b/0x20
? __wake_up_klogd+0xd5/0x110
? prog_array_map_poke_run+0x6b4/0x6d0
? bpf_prog_6781ebc2dae4bad9+0xb/0x53
fd_array_map_delete_elem+0x152/0x250
prog_array_map_clear_deferred+0xf6/0x210
? __bpf_array_map_seq_show+0xa40/0xa40
? kick_pool+0x164/0x350
? process_one_work+0x57a/0xd00
process_one_work+0x5e4/0xd00
worker_thread+0x9cf/0xea0
kthread+0x2b4/0x350
? pr_cont_work+0x580/0x580
? kthread_blkcg+0xd0/0xd0
ret_from_fork+0x4a/0x80
? kthread_blkcg+0xd0/0xd0
ret_from_fork_asm+0x11/0x20
</TASK>
Modules linked in:
---[ end trace 0000000000000000 ]---

However, with my very limited BPF subsystem knowledge I was unable to
trivially fix the issue. Hopefully some knowledgable person would be
kind enough to provide me with some pointers.

bpf_arch_text_poke() seems to be returning -EBUSY due to a negative
memcmp() result from [3].

ret = -EBUSY;
mutex_lock(&text_mutex);
if (memcmp(ip, old_insn, X86_PATCH_SIZE)) {
goto out;
[...]

When spitting out the memory at those locations, this is the result:

ip: e9 06 00 00 00
old_insn: 0f 1f 44 00 00
nop_insn: 0f 1f 44 00 00

As you can see, the information stored in 'ip' does not match that of
the data stored in 'old_insn', causing bpf_arch_text_poke() to return
early with the error -EBUSY, suggesting that the data pointed to by
'old_insn', and by extension 'prog' should have been changed when
emit_call()ing, to the value of 'ip', but wasn't.

It's possible for me to see what is happening, but I'm afraid finding
possible causes of corruption became too time consuming on this
occasion. Would anyone be able to chime in to provide their take on
possible causes please?

Any help would be gratefully received.

[0] https://syzkaller.appspot.com/bug?extid=97a4fe20470e9bc30810
[1] https://elixir.bootlin.com/linux/latest/source/kernel/bpf/arraymap.c#L1092
[2] https://syzkaller.appspot.com/text?tag=ReproC&x=1397180f680000
[3] https://elixir.bootlin.com/linux/latest/source/arch/x86/net/bpf_jit_comp.c#L387

--
Lee Jones [李琼斯]