Re: [PATCH net 0/2] Fix NPE discovered by running bpf kselftest

From: Björn Töpel
Date: Thu Dec 19 2024 - 04:17:23 EST


Björn Töpel <bjorn@xxxxxxxxxx> writes:

> Levi Zim <rsworktech@xxxxxxxxxxx> writes:
>
>> On 2024-12-04 09:01, Cong Wang wrote:
>>> On Sun, Dec 01, 2024 at 09:42:08AM +0800, Levi Zim wrote:
>>>> On 2024-11-30 21:38, Levi Zim via B4 Relay wrote:
>>>>> I found that bpf kselftest sockhash::test_txmsg_cork_hangs in
>>>>> test_sockmap.c triggers a kernel NULL pointer dereference:
>>> Interesting, I also ran this test recently and I didn't see such a
>>> crash.
>>
>> I am also curious about why other people or the CI didn't hit such crash.
>
> FWIW, I'm hitting it on RISC-V:
>
> | Unable to handle kernel access to user memory without uaccess routines at virtual address 0000000000000008
> | Oops [#1]
> | Modules linked in: sch_fq_codel drm fuse drm_panel_orientation_quirks backlight
> | CPU: 7 UID: 0 PID: 732 Comm: test_sockmap Not tainted 6.13.0-rc3-00017-gf44d154d6e3d #1
> | Hardware name: riscv-virtio qemu/qemu, BIOS 2025.01-rc3-00042-gacab6e78aca7 01/01/2025
> | epc : splice_to_socket+0x376/0x49a
> | ra : splice_to_socket+0x37c/0x49a
> | epc : ffffffff803d9ffc ra : ffffffff803da002 sp : ff20000001c3b8b0
> | gp : ffffffff827aefa8 tp : ff60000083450040 t0 : ff6000008a12d001
> | t1 : 0000100100001001 t2 : 0000000000000000 s0 : ff20000001c3bae0
> | s1 : ffffffffffffefff a0 : ff6000008245e200 a1 : ff60000087dd0450
> | a2 : 0000000000000000 a3 : 0000000000000000 a4 : 0000000000000000
> | a5 : 0000000000000000 a6 : ff20000001c3b450 a7 : ff6000008a12c004
> | s2 : 000000000000000f s3 : ff6000008245e2d0 s4 : ff6000008245e280
> | s5 : 0000000000000000 s6 : 0000000000000002 s7 : 0000000000001001
> | s8 : 0000000000003001 s9 : 0000000000000002 s10: 0000000000000002
> | s11: ff6000008245e200 t3 : ffffffff8001e78c t4 : 0000000000000000
> | t5 : 0000000000000000 t6 : ff6000008869f230
> | status: 0000000200000120 badaddr: 0000000000000008 cause: 000000000000000d
> | [<ffffffff803d9ffc>] splice_to_socket+0x376/0x49a
> | [<ffffffff803d8bc0>] direct_splice_actor+0x44/0x216
> | [<ffffffff803d8532>] splice_direct_to_actor+0xb6/0x1e8
> | [<ffffffff803d8780>] do_splice_direct+0x70/0xa2
> | [<ffffffff80392e40>] do_sendfile+0x26e/0x2d4
> | [<ffffffff803939d4>] __riscv_sys_sendfile64+0xf2/0x10e
> | [<ffffffff80fdfb64>] do_trap_ecall_u+0x1f8/0x26c
> | [<ffffffff80fedaee>] _new_vmalloc_restore_context_a0+0xc6/0xd2
> | Code: c5d8 9e35 c590 8bb3 40db eb01 6998 b823 0005 856e (6718) 2d05
> | ---[ end trace 0000000000000000 ]---
> | Kernel panic - not syncing: Fatal exception
> | SMP: stopping secondary CPUs
> | ---[ end Kernel panic - not syncing: Fatal exception ]---
>
> This is commit f44d154d6e3d ("Merge tag 'soc-fixes-6.13' of
> git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc").
>
> (Yet to bisect!)

Took the series for a run, and it does solve the crash, but I'm getting
additional failures:

| [TEST 298]: (512, 1, 3, sendpage, pass,pop (1,3),ktls,): socket(peer2) kTLS enabled
| socket(client1) kTLS enabled
| recv failed(): Invalid argument
| rx thread exited with err 1.
| FAILED
| [TEST 299]: (100, 1, 5, sendpage, pass,pop (1,3),ktls,): socket(peer2) kTLS enabled
| socket(client1) kTLS enabled
| recv failed(): Invalid argument
| rx thread exited with err 1.
| FAILED
| [TEST 300]: (2, 32, 8192, sendpage, pass,pop (4096,8192),ktls,): socket(peer2) kTLS enabled
| socket(client1) kTLS enabled
| recv failed(): Bad message
| rx thread exited with err 1.
| FAILED
| ...
| #42/ 9 sockhash:ktls:txmsg test pop-data:FAIL
| ...
| [TEST 308]: (2, 32, 8192, sendpage, pass,pop (5,21),ktls,): socket(peer2) kTLS enabled
| socket(client1) kTLS enabled
| recv failed(): Bad message
| rx thread exited with err 1.
| FAILED
| [TEST 309]: (2, 32, 8192, sendpage, pass,pop (1,11),ktls,): socket(peer2) kTLS enabled
| socket(client1) kTLS enabled
| recv failed(): Bad message
| rx thread exited with err 1.
| FAILED
| ...
| #43/ 6 sockhash:ktls:txmsg test push/pop data:FAIL