riscv32 EXT4 splat, 6.8 regression?

From: Björn Töpel
Date: Fri Apr 12 2024 - 10:57:22 EST


Hi!

I've been looking at an EXT4 splat on riscv32 that LKFT found [1]:

| EXT4-fs (vda): mounted filesystem 13697a42-d10e-4a9e-8e56-cb9083be92f9 ro with ordered data mode. Quota mode: disabled.
| VFS: Mounted root (ext4 filesystem) readonly on device 254:0.
| Unable to handle kernel NULL pointer dereference at virtual address 00000006
| Oops [#1]
| Modules linked in:
| CPU: 1 PID: 1 Comm: swapper/0 Not tainted 6.8.0 #41
| Hardware name: riscv-virtio,qemu (DT)
| epc : ext4_search_dir+0x52/0xe4
| ra : __ext4_find_entry+0x1d6/0x578
| epc : c035b60e ra : c035b876 sp : c253fc10
| gp : c21a7380 tp : c25c8000 t0 : 44c0657f
| t1 : 0000000c t2 : 1de5b089 s0 : c253fc50
| s1 : 00000000 a0 : fffffffc a1 : fffff000
| a2 : 00000000 a3 : c29c04f8 a4 : c253fd00
| a5 : 00000000 a6 : c253fcfc a7 : fffffff3
| s2 : 00001000 s3 : 00000000 s4 : 00001000
| s5 : c29c04f8 s6 : c292db40 s7 : c253fcfc
| s8 : fffffff7 s9 : c253fd00 s10: fffff000
| s11: c292db40 t3 : 00000007 t4 : 5e8b4525
| t5 : 00000000 t6 : 00000000
| status: 00000120 badaddr: 00000006 cause: 0000000d
| [<c035b60e>] ext4_search_dir+0x52/0xe4
| [<c035b876>] __ext4_find_entry+0x1d6/0x578
| [<c035bcaa>] ext4_lookup+0x92/0x200
| [<c0295c14>] __lookup_slow+0x8e/0x142
| [<c029943a>] walk_component+0x104/0x174
| [<c0299f18>] path_lookupat+0x78/0x182
| [<c029b24c>] filename_lookup+0x96/0x158
| [<c029b346>] kern_path+0x38/0x56
| [<c0c1bee4>] init_mount+0x46/0x96
| [<c0c2ae1c>] devtmpfs_mount+0x44/0x7a
| [<c0c01c26>] prepare_namespace+0x226/0x27c
| [<c0c01130>] kernel_init_freeable+0x27e/0x2a0
| [<c0b78402>] kernel_init+0x2a/0x158
| [<c0b82bf2>] ret_from_fork+0xe/0x20
| Code: 84ae a809 d303 0044 949a 0f63 0603 991a fd63 0584 (c603) 0064
| ---[ end trace 0000000000000000 ]---
| Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b

This was not present in 6.7. Bisection wasn't really helpful (to me at
least); I got it down to commit c604110e662a ("Merge tag 'vfs-6.8.misc'
of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs"), and when I
reverted the commits in the vfs merge the splat went away, but I *really*
struggle to see how those are related...

What I see in ext4_search_dir() is that search_buf is 0xfffff000 (see a1
and s10 in the register dump), and at some point the address wraps to
zero, and boom: the access at 0x00000006 above. I doubt that 0xfffff000
is a sane address.
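
To make the wrap concrete, here is a minimal user-space sketch (not
kernel code). It assumes the end-of-buffer limit is computed as
search_buf + buf_size with 32-bit pointers, and that buf_size is 0x1000
(my guess from s2/s4 in the dump). The names dirent_sketch, buf_limit32,
limit_wrapped and name_len_addr are made up for illustration; the struct
only mirrors the header layout of an ext4 directory entry so that the
offset of name_len can be checked.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of the ext4 directory entry header (layout only). */
struct dirent_sketch {
	uint32_t inode;     /* offset 0 */
	uint16_t rec_len;   /* offset 4 */
	uint8_t  name_len;  /* offset 6 */
	uint8_t  file_type; /* offset 7 */
	char     name[];    /* offset 8 */
};

/* End-of-buffer address as 32-bit arithmetic would compute it.
 * Unsigned addition wraps modulo 2^32. */
static uint32_t buf_limit32(uint32_t search_buf, uint32_t buf_size)
{
	return search_buf + buf_size;
}

/* Nonzero if the computed limit ended up below the buffer start,
 * i.e. the addition wrapped around the top of the address space. */
static int limit_wrapped(uint32_t search_buf, uint32_t buf_size)
{
	return buf_limit32(search_buf, buf_size) < search_buf;
}

/* Address touched when reading name_len from an entry located at 'de'. */
static uint32_t name_len_addr(uint32_t de)
{
	return de + (uint32_t)offsetof(struct dirent_sketch, name_len);
}
```

With search_buf = 0xfffff000 and buf_size = 0x1000, buf_limit32() gives
0, so any "de < limit" style bound is already broken. And if the entry
pointer itself ends up at 0, name_len_addr(0) is 6, which would match
badaddr in the oops. That is only my reading of the dump, though.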

Maybe this is something the fs folks can spot directly? In the
meantime I'll continue to dig...


Thanks, and have a nice weekend!
Björn


[1] https://qa-reports.linaro.org/lkft/linux-stable-rc-linux-6.8.y/build/v68.4-281-g6d08df6c401e/testrun/23369914/suite/log-parser-test/tests/