Re: [PATCH] bpf: Fix out-of-bounds write in trie_get_next_key()

From: Byeonguk Jeong
Date: Tue Oct 22 2024 - 21:32:20 EST


On Tue, Oct 22, 2024 at 12:51:05PM -0700, Alexei Starovoitov wrote:
> On Mon, Oct 21, 2024 at 6:49 PM Byeonguk Jeong <jungbu2855@xxxxxxxxx> wrote:
> >
> > trie_get_next_key() allocates a node stack with size trie->max_prefixlen,
> > while it writes (trie->max_prefixlen + 1) nodes to the stack when it has
> > full paths from the root to leaves. For example, consider a trie with
> > max_prefixlen is 8, and the nodes with key 0x00/0, 0x00/1, 0x00/2, ...
> > 0x00/8 inserted. Subsequent calls to trie_get_next_key with _key with
> > .prefixlen = 8 make 9 nodes be written on the node stack with size 8.
>
> Hmm. It sounds possible, but pls demonstrate it with a selftest.
> With the amount of fuzzing I'm surprised it was not discovered earlier.
>
> pw-bot: cr

With the simple test below, the kernel crashes within a minute. On a
KFENCE-enabled kernel, the bug is also easy to catch.

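#!/bin/bash
# key 5 = 4-byte prefixlen + 1 data byte; flags 0x1 = BPF_F_NO_PREALLOC
bpftool map create /sys/fs/bpf/lpm type lpm_trie key 5 value 1 \
	entries 16 flags 0x1 name lpm

# Insert 0x00/0, 0x00/1, ..., 0x00/8 (prefixlen $i, data byte 00)
for i in {0..8}; do
	bpftool map update pinned /sys/fs/bpf/lpm \
		key hex 0$i 00 00 00 00 \
		value hex 00 any
done

# Dumping the map iterates it with get_next_key
while true; do
	bpftool map dump pinned /sys/fs/bpf/lpm
done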
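
For reference, the 5-byte key in the script is a 32-bit prefixlen followed
by one data byte, the layout of struct bpf_lpm_trie_key. A minimal sketch of
that encoding follows. encode_key() and KEY_SIZE are hypothetical names used
only for illustration, and the byte order assumes a little-endian host such
as x86-64.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 4 bytes of prefixlen + 1 byte of data, as in "key 5" above. */
#define KEY_SIZE 5

/*
 * Hypothetical helper (not a bpftool or kernel function): fill out[] with
 * the bytes that "key hex 0$i 00 00 00 00" hands to bpftool, assuming a
 * little-endian host.
 */
static void encode_key(uint32_t prefixlen, uint8_t data, uint8_t out[KEY_SIZE])
{
	assert(out != NULL);

	/* prefixlen, least significant byte first */
	out[0] = (uint8_t)(prefixlen & 0xff);
	out[1] = (uint8_t)((prefixlen >> 8) & 0xff);
	out[2] = (uint8_t)((prefixlen >> 16) & 0xff);
	out[3] = (uint8_t)((prefixlen >> 24) & 0xff);

	/* single data byte; always 0x00 in the reproducer */
	out[4] = data;
}
```

So encode_key(8, 0x00, buf) yields the bytes 08 00 00 00 00, i.e. the key
0x00/8 from the loop's last iteration.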

In my environment (6.12-rc4, with CONFIG_KFENCE), dmesg gave me this
message as expected.

[ 463.141394] BUG: KFENCE: out-of-bounds write in trie_get_next_key+0x2f2/0x670

[ 463.143422] Out-of-bounds write at 0x0000000095bc45ea (256B right of kfence-#156):
[ 463.144438] trie_get_next_key+0x2f2/0x670
[ 463.145439] map_get_next_key+0x261/0x410
[ 463.146444] __sys_bpf+0xad4/0x1170
[ 463.147438] __x64_sys_bpf+0x74/0xc0
[ 463.148431] do_syscall_64+0x79/0x150
[ 463.149425] entry_SYSCALL_64_after_hwframe+0x76/0x7e

[ 463.151436] kfence-#156: 0x00000000279749c1-0x0000000034dc4abb, size=256, cache=kmalloc-256

[ 463.153414] allocated by task 2021 on cpu 2 at 463.140440s (0.012974s ago):
[ 463.154413] trie_get_next_key+0x252/0x670
[ 463.155411] map_get_next_key+0x261/0x410
[ 463.156402] __sys_bpf+0xad4/0x1170
[ 463.157390] __x64_sys_bpf+0x74/0xc0
[ 463.158386] do_syscall_64+0x79/0x150
[ 463.159372] entry_SYSCALL_64_after_hwframe+0x76/0x7e
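
The off-by-one itself can be modelled in user space. The sketch below is a
simplified stand-in, not the kernel code: struct node, build_chain() and
count_pushes() are hypothetical names, and the model keeps only the prefix
length of each node because every key in the reproducer has data 0x00. It
counts how many nodes a root-to-key walk would write to the node stack.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PREFIXLEN 8	/* key size 5 => 1 data byte => 8 bits */

/* Simplified stand-in for a trie node (assumption, not the kernel struct). */
struct node {
	unsigned int prefixlen;
	struct node *child;	/* next, more specific prefix */
};

/* Build the chain 0x00/0 -> 0x00/1 -> ... -> 0x00/MAX_PREFIXLEN. */
static struct node *build_chain(struct node nodes[MAX_PREFIXLEN + 1])
{
	assert(nodes != NULL);

	for (unsigned int i = 0; i <= MAX_PREFIXLEN; i++) {
		nodes[i].prefixlen = i;
		nodes[i].child = (i < MAX_PREFIXLEN) ? &nodes[i + 1] : NULL;
	}
	return &nodes[0];
}

/*
 * Count the nodes pushed while walking from the root down to the node
 * matching key_prefixlen. Each visited node is pushed before the walk
 * checks whether it has reached the key.
 */
static unsigned int count_pushes(const struct node *root,
				 unsigned int key_prefixlen)
{
	unsigned int pushes = 0;

	for (const struct node *n = root; n; n = n->child) {
		pushes++;	/* models one write to the node stack */
		if (n->prefixlen == key_prefixlen)
			break;
	}
	return pushes;
}
```

With the chain from the reproducer, count_pushes(root, 8) returns 9, one
more than the 8 slots a stack of size max_prefixlen provides.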