Re: [PATCH RFC] ubifs: Fix use-after-free in ubifs_tnc_end_commit

From: Zhihao Cheng
Date: Sat Oct 12 2024 - 08:30:33 EST


在 2024/10/9 22:46, Waqar Hameed 写道:
Running

rm -f /etc/test-file.bin
dd if=/dev/urandom of=/etc/test-file.bin bs=1M count=60 conv=fsync

in a loop, with `CONFIG_UBIFS_FS_AUTHENTICATION`, KASAN reports:

BUG: KASAN: use-after-free in ubifs_tnc_end_commit+0xa5c/0x1950
Write of size 32 at addr ffffff800a3af86c by task ubifs_bgt0_20/153

Call trace:
dump_backtrace+0x0/0x340
show_stack+0x18/0x24
dump_stack_lvl+0x9c/0xbc
print_address_description.constprop.0+0x74/0x2b0
kasan_report+0x1d8/0x1f0
kasan_check_range+0xf8/0x1a0
memcpy+0x84/0xf4
ubifs_tnc_end_commit+0xa5c/0x1950
do_commit+0x4e0/0x1340
ubifs_bg_thread+0x234/0x2e0
kthread+0x36c/0x410
ret_from_fork+0x10/0x20

Allocated by task 401:
kasan_save_stack+0x38/0x70
__kasan_kmalloc+0x8c/0xd0
__kmalloc+0x34c/0x5bc
tnc_insert+0x140/0x16a4
ubifs_tnc_add+0x370/0x52c
ubifs_jnl_write_data+0x5d8/0x870
do_writepage+0x36c/0x510
ubifs_writepage+0x190/0x4dc
__writepage+0x58/0x154
write_cache_pages+0x394/0x830
do_writepages+0x1f0/0x5b0
filemap_fdatawrite_wbc+0x170/0x25c
file_write_and_wait_range+0x140/0x190
ubifs_fsync+0xe8/0x290
vfs_fsync_range+0xc0/0x1e4
do_fsync+0x40/0x90
__arm64_sys_fsync+0x34/0x50
invoke_syscall.constprop.0+0xa8/0x260
do_el0_svc+0xc8/0x1f0
el0_svc+0x34/0x70
el0t_64_sync_handler+0x108/0x114
el0t_64_sync+0x1a4/0x1a8

Freed by task 403:
kasan_save_stack+0x38/0x70
kasan_set_track+0x28/0x40
kasan_set_free_info+0x28/0x4c
__kasan_slab_free+0xd4/0x13c
kfree+0xc4/0x3a0
tnc_delete+0x3f4/0xe40
ubifs_tnc_remove_range+0x368/0x73c
ubifs_tnc_remove_ino+0x29c/0x2e0
ubifs_jnl_delete_inode+0x150/0x260
ubifs_evict_inode+0x1d4/0x2e4
evict+0x1c8/0x450
iput+0x2a0/0x3c4
do_unlinkat+0x2cc/0x490
__arm64_sys_unlinkat+0x90/0x100
invoke_syscall.constprop.0+0xa8/0x260
do_el0_svc+0xc8/0x1f0
el0_svc+0x34/0x70
el0t_64_sync_handler+0x108/0x114
el0t_64_sync+0x1a4/0x1a8

The offending `memcpy` is in `ubifs_copy_hash()`. Fix this by checking
if the `znode` is obsolete before accessing the hash (just like we do
for `znode->parent`).

Do you mean that the UAF occurs in following path:
do_commit -> ubifs_tnc_end_commit -> write_index:
while (1) {
...
znode = cnext;
...
if (znode->cparent)
ubifs_copy_hash(c, hash, znode->cparent->zbranch[znode->ciip].hash); // znode->cparent has been freed!
}

If so, according to the current implementation(lastest linux kernel is v6.12), I cannot understand that how the znode->cparent is freed before write_index() finished, it looks impossible.
All dirty znodes are gathered by function get_znodes_to_commit() which is protected by c->tnc_mutex, and the 'cparent' member in all dirty znodes is assigned with non-NULL. Then I think the znode memory freeing path 'tnc_delete->kfree(znode)' cannot happen, because:
1) If a znode is dirtied, all its' ancestor znodes(a path from znode to root znode) must be dirtied, which is guaranteed by UBIFS. See dirty_cow_bottom_up/lookup_level0_dirty.
2) A dirty znode waiting for commit cannot be freed before write_index() finished, which is guaranteed by tnc_delete:
if (znode->cnext) {
__set_bit(OBSOLETE_ZNODE, &znode->flags);
...
} else {
kfree(znode);
}

Fixes: 16a26b20d2af ("ubifs: authentication: Add hashes to index nodes")
Signed-off-by: Waqar Hameed <waqar.hameed@xxxxxxxx>
---
I'm not entirely sure if this is the _correct_ way to fix this. However,
testing shows that the problem indeed disappears.

My understanding is that the `znode` could concurrently be deleted (with
a reference in an unprotected code section without `tnc_mutex`). The
assumption is that in this case it would be sufficient to check
`ubifs_zn_obsolete(znode)`, like as in the if-statement for
`znode->parent` just below.

I'm analyzing tnc-related code these days, however I can't find places that may concurrently operate the same znode. And I cannot reproduce the problem with your reproducer:
while true; do
rm -f /UBIFS_MNT/test-file.bin
dd if=/dev/urandom of=/UBIFS_MNT/test-file.bin bs=1M count=60 conv=fsync
done

Can you dig more deeper by adding more debug message, so that we can figure out what is really happening.

I'll be happy to get any helpful feedback!

fs/ubifs/tnc_commit.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/fs/ubifs/tnc_commit.c b/fs/ubifs/tnc_commit.c
index a55e04822d16..0b358254272b 100644
--- a/fs/ubifs/tnc_commit.c
+++ b/fs/ubifs/tnc_commit.c
@@ -891,8 +891,10 @@ static int write_index(struct ubifs_info *c)
mutex_lock(&c->tnc_mutex);
if (znode->cparent)
- ubifs_copy_hash(c, hash,
- znode->cparent->zbranch[znode->ciip].hash);
+ if (!ubifs_zn_obsolete(znode))
+ ubifs_copy_hash(c, hash,
+ znode->cparent->zbranch[znode->ciip]
+ .hash);
if (znode->parent) {
if (!ubifs_zn_obsolete(znode))