[PATCH 0/1] Kernel oopses on ext2 filesystems after 782b76d7abdf

From: Javier Pello
Date: Tue Jul 13 2021 - 10:58:28 EST


From: Javier Pello <javier.pello@xxxxxxx>

The current mainline kernel oopses when handling ext2 filesystems,
and the filesystem is not usable, sometimes leading to a panic.

I recently tried to upgrade the kernel on my system from 5.10.7 to
5.13.1 but, when I booted the new kernel, I started getting oopses
consistently during the boot process as the init script tried to
mount an ext2 filesystem, and other weird behaviour (like tasks
stuck in uninterruptible sleep) if I accessed the filesystem later
(which I did not do for long for fear of data loss). I managed to
set up a QEMU virtual machine with as small a reproducer as I could
get (kernel stripped of as much stuff as possible) and this is the
oops message that I got (I set the kernel to panic on oops to that
I could get the first report; otherwise, the kernel tries to go on
but ends up attempting to kill init):

BUG: kernel NULL pointer dereference, address: 00000004
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
*pde = 00000000
Oops: 0000 [#1]
CPU: 0 PID: 41 Comm: init.boot Not tainted 5.12.0-rc3+ #23
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.11.1-0-g0551a4be2c-prebuilt.qemu-project.org 04/01/2014
EIP: ext2_get_page.isra.0+0xe2/0x2b0
Code: 8b 45 dc 89 5d cc 8b 80 cc 01 00 00 8b 40 34 8b 00 89 45 d4 8b 45 e0 f7 d8 89 45 e0 8b 45 e8 83 e8 0c 89 45 d0 8b 45 f0 01 f8 <0f> b7 48 04 89 ca 66 83 f9 0b 76 4a 83 e2 03 0f 85 09 01 00 00 0f
EAX: 00000000 EBX: f73fa700 ECX: 00000000 EDX: 00000001
ESI: 00000000 EDI: 00000000 EBP: c2509ce8 ESP: c2509cb0
DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068 EFLAGS: 00010246
CR0: 80050033 CR2: 00000004 CR3: 024dd000 CR4: 00150e90
Call Trace:
ext2_find_entry+0x79/0x240
ext2_inode_by_name+0x16/0x70
ext2_lookup+0x27/0x70
__lookup_slow+0x4f/0xe0
walk_component+0xf7/0x160
link_path_walk.part.0+0x24d/0x350
? terminate_walk+0x7d/0xf0
path_lookupat+0x39/0x180
filename_lookup+0x78/0x130
? kmem_cache_alloc+0x21/0x130
? getname_flags+0x1f/0x160
? getname_flags+0x36/0x160
user_path_at_empty+0x25/0x30
vfs_statx+0x53/0xd0
__do_sys_stat64+0x27/0x50
__ia32_sys_stat64+0xd/0x10
__do_fast_syscall_32+0x40/0x70
do_fast_syscall_32+0x28/0x60
do_SYSENTER_32+0x15/0x20
entry_SYSENTER_32+0x98/0xe6
EIP: 0xb7ee8549
Code: b8 01 10 06 03 74 b4 01 10 07 03 74 b0 01 10 08 03 74 d8 01 00 00 00 00 00 00 00 00 00 00 00 00 00 51 52 55 89 e5 0f 34 cd 80 <5d> 5a 59 c3 90 90 90 90 8d 76 00 58 b8 77 00 00 00 cd 80 90 8d 76
EAX: ffffffda EBX: 0a0dd360 ECX: bf993cc0 EDX: b7e64000
ESI: 0a0dd360 EDI: 0a0dd000 EBP: 0a0dd340 ESP: bf993c98
DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b EFLAGS: 00000296
CR2: 0000000000000004
---[ end trace 865a3cf144c3889e ]---

I am running a 32-bit kernel on an x86 machine, in case this is
relevant. Also, I noticed that the high memory support config
option must be set to 4GB to trigger the bug; I could not
reproduce the bug if I set high memory to off.

As the text says, the oops is triggered within ext2_get_page.
I managed to track it down to line 130 in fs/ext2/dir.c, within
ext2_check_page,
rec_len = ext2_rec_len_from_disk(p->rec_len);
Adding a printk in the function I confirmed that, while variable
page has a non-null value on function entry, the assignment
char *kaddr = page_address(page);
in line 114 sets kaddr to a null value indeed, and this triggers
the bug when dereferenced later.

I bisected the problem to commit

782b76d7abdf02b12c46ed6f1e9bf715569027f7
fs/ext2: Replace kmap() with kmap_local_page()

The oops triggers consistently on this commit but its parent
commit works fine. Analysing the commit, I think that it may be
incomplete, as ext2_check_page and ext2_delete_entry are still
using page_address to get at the page address, but this no longer
works because those pages are now mapped with kmap_local_page,
not kmap. It seems to me that ext2_check_page and ext2_delete_entry
should be provided with the page address that their callers already
have for the page, as with all other functions in fs/ext2/dir.c
now. The proposed patch does precisely this.

Javier Pello (1):
fs/ext2: Avoid page_address on pages returned by ext2_get_page

fs/ext2/dir.c | 20 ++++++++++----------
fs/ext2/ext2.h | 3 ++-
fs/ext2/namei.c | 4 ++--
3 files changed, 14 insertions(+), 13 deletions(-)

--
2.30.1