Re: riscv32 EXT4 splat, 6.8 regression?

From: Nam Cao
Date: Sat Apr 13 2024 - 10:43:46 EST


On 2024-04-12 Björn Töpel wrote:
> Hi!
>
> I've been looking at an EXT4 splat on riscv32, that LKFT found [1]:
>
> | EXT4-fs (vda): mounted filesystem 13697a42-d10e-4a9e-8e56-cb9083be92f9 ro with ordered data mode. Quota mode: disabled.
> | VFS: Mounted root (ext4 filesystem) readonly on device 254:0.
> | Unable to handle kernel NULL pointer dereference at virtual address 00000006
> | Oops [#1]
> | Modules linked in:
> | CPU: 1 PID: 1 Comm: swapper/0 Not tainted 6.8.0 #41
> | Hardware name: riscv-virtio,qemu (DT)
> | epc : ext4_search_dir+0x52/0xe4
> | ra : __ext4_find_entry+0x1d6/0x578
> | epc : c035b60e ra : c035b876 sp : c253fc10
> | gp : c21a7380 tp : c25c8000 t0 : 44c0657f
> | t1 : 0000000c t2 : 1de5b089 s0 : c253fc50
> | s1 : 00000000 a0 : fffffffc a1 : fffff000
> | a2 : 00000000 a3 : c29c04f8 a4 : c253fd00
> | a5 : 00000000 a6 : c253fcfc a7 : fffffff3
> | s2 : 00001000 s3 : 00000000 s4 : 00001000
> | s5 : c29c04f8 s6 : c292db40 s7 : c253fcfc
> | s8 : fffffff7 s9 : c253fd00 s10: fffff000
> | s11: c292db40 t3 : 00000007 t4 : 5e8b4525
> | t5 : 00000000 t6 : 00000000
> | status: 00000120 badaddr: 00000006 cause: 0000000d
> | [<c035b60e>] ext4_search_dir+0x52/0xe4
> | [<c035b876>] __ext4_find_entry+0x1d6/0x578
> | [<c035bcaa>] ext4_lookup+0x92/0x200
> | [<c0295c14>] __lookup_slow+0x8e/0x142
> | [<c029943a>] walk_component+0x104/0x174
> | [<c0299f18>] path_lookupat+0x78/0x182
> | [<c029b24c>] filename_lookup+0x96/0x158
> | [<c029b346>] kern_path+0x38/0x56
> | [<c0c1bee4>] init_mount+0x46/0x96
> | [<c0c2ae1c>] devtmpfs_mount+0x44/0x7a
> | [<c0c01c26>] prepare_namespace+0x226/0x27c
> | [<c0c01130>] kernel_init_freeable+0x27e/0x2a0
> | [<c0b78402>] kernel_init+0x2a/0x158
> | [<c0b82bf2>] ret_from_fork+0xe/0x20
> | Code: 84ae a809 d303 0044 949a 0f63 0603 991a fd63 0584 (c603) 0064
> | ---[ end trace 0000000000000000 ]---
> | Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
>
> This was not present in 6.7. Bisection wasn't really helpful (to me at
> least); I got it down to commit c604110e662a ("Merge tag 'vfs-6.8.misc'
> of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs"), and when I
> reverted the commits in the vfs merge the splat went away, but I *really*
> struggle to see how those are related...
>
> What I see in ext4_search_dir() is that search_buf is 0xfffff000, and at
> some point the address wraps to zero, and boom. I doubt that 0xfffff000
> is a sane address.

I have zero knowledge about file systems, but I think it's an integer
overflow problem. The calculation of "dlimit" overflows and dlimit wraps
around, which leads to wrong comparisons later on.
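
To make the wrap-around concrete, here is a small userspace sketch (not
kernel code) that models 32-bit addresses with uint32_t. The helper names
and BASE_DIR_LEN are made up for illustration; search_buf = 0xfffff000 is
the value from the report, while buf_size = 0x1000 is my assumption (a 4 KiB
block).

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative stand-in for EXT4_BASE_DIR_LEN (assumption, not the kernel value). */
#define BASE_DIR_LEN 8u

/* End-of-buffer address as the old code computes it; wraps modulo 2^32. */
static uint32_t dlimit_of(uint32_t search_buf, uint32_t buf_size)
{
	return search_buf + buf_size;
}

/* Old loop condition: compare the entry address directly against dlimit. */
static bool old_loop_cond(uint32_t de, uint32_t search_buf, uint32_t buf_size)
{
	return de < dlimit_of(search_buf, buf_size) - BASE_DIR_LEN;
}
```

With search_buf = 0xfffff000 and buf_size = 0x1000, dlimit_of() returns 0.
Once "de" has walked to the end of the block it has wrapped to 0 as well,
and old_loop_cond(0, 0xfffff000, 0x1000) is still true, so the loop goes on
to dereference it. That would be consistent with the fault at 00000006,
since name_len sits at offset 6 in struct ext4_dir_entry_2.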

I guess that explains why your bisect and Conor's bisect results are
strange: the bug has been here for quite some time, but it only appears
when "dlimit" happens to overflow.

It can be fixed by rearranging the comparisons a bit. Can you give the
patch below a try?

Best regards,
Nam

diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c
index 05b647e6bc19..71b88b33b676 100644
--- a/fs/ext4/namei.c
+++ b/fs/ext4/namei.c
@@ -1532,15 +1532,13 @@ int ext4_search_dir(struct buffer_head *bh, char *search_buf, int buf_size,
 		    unsigned int offset, struct ext4_dir_entry_2 **res_dir)
 {
 	struct ext4_dir_entry_2 * de;
-	char * dlimit;
 	int de_len;
 
 	de = (struct ext4_dir_entry_2 *)search_buf;
-	dlimit = search_buf + buf_size;
-	while ((char *) de < dlimit - EXT4_BASE_DIR_LEN) {
+	while ((char *) de - search_buf < buf_size - EXT4_BASE_DIR_LEN) {
 		/* this code is executed quadratically often */
 		/* do minimal checking `by hand' */
-		if (de->name + de->name_len <= dlimit &&
+		if (de->name + de->name_len - search_buf <= buf_size &&
 		    ext4_match(dir, fname, de)) {
 			/* found a match - just to be sure, do
 			 * a full check */
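
For illustration only, the rearranged comparisons in the hunk above can be
modelled in userspace with 32-bit unsigned arithmetic. This is a sketch, not
the kernel code: the helper names and BASE_DIR_LEN are hypothetical, and it
assumes buf_size is at least BASE_DIR_LEN.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative stand-in for EXT4_BASE_DIR_LEN (assumption, not the kernel value). */
#define BASE_DIR_LEN 8u

/*
 * Offset-based loop condition, modelled on 32-bit addresses.
 * de - search_buf is the distance walked so far; it stays correct even
 * when search_buf + buf_size would wrap past 2^32.
 * Assumes buf_size >= BASE_DIR_LEN.
 */
static bool offset_loop_cond(uint32_t de, uint32_t search_buf, uint32_t buf_size)
{
	return de - search_buf < buf_size - BASE_DIR_LEN;
}

/* Offset-based bound check for the end of an entry's name. */
static bool offset_name_in_bounds(uint32_t name_end, uint32_t search_buf,
				  uint32_t buf_size)
{
	return name_end - search_buf <= buf_size;
}
```

With search_buf = 0xfffff000 and an assumed buf_size of 0x1000, an entry
address that has wrapped to 0 gives an offset of 0x1000, so
offset_loop_cond() is false and the walk stops at the end of the block. A
name that ends exactly at the end of the block is still accepted by
offset_name_in_bounds().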