Re: [PATCH RFC] ubifs: Fix use-after-free in ubifs_tnc_end_commit

From: Zhihao Cheng
Date: Tue Oct 15 2024 - 22:11:59 EST


在 2024/10/16 2:52, Waqar Hameed 写道:
Hi Waqar,
[...]

For completeness, here are the _exact_ steps that I have used to
reproduce this on my system with v6.12-rc2 (commit 75b607fab38d "Merge
tag 'sched_ext-for-6.12-rc2-fixes' of
git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext"):

```
ubiattach -m 2

keyctl add logon dummy_key: dummy_load @us

ubimkvol /dev/ubi0 -s 80MiB -n 0 -N test-vol
ubiupdatevol /dev/ubi0_0 -t

mount -t ubifs /dev/ubi0_0 /mnt/flash -o auth_hash_name=sha256,auth_key=dummy_key:

count=0
while true; do
date
count=$(($count + 1))
echo count=$count

rm -f /mnt/flash/test-file.bin
dd if=/dev/urandom of=/mnt/flash/test-file.bin bs=1M count=60 conv=fsync

echo ""
done
```


Thanks for the complete program, I will try it again.
BTW, what is the configuration of your flash?(eg. erase size, page size)?

Note that you need to have `CONFIG_UBIFS_FS_AUTHENTICATION=y` (and
`CONFIG_KASAN=y` obviously) in your `.config` in order to trigger the
offending `memcpy()` in `ubifs_copy_hash()`. Also, it takes a while. For
example, last time it took me 88 iterations of the above loop before it
triggered. So you might need to let it spin for a while.

Yes, above two configs are enabled, and I have added printing messages to confirm the authentication path is active.


Can you dig more deeper by adding more debug message, so that we can figure out
what is really happening.

Certainly! I could try to enable the debug prints from UBIFS, however
they are *a lot*. Moreover, printing that much changes the timing
behavior and might make it harder to trigger the use-after-free. Do you
have any tips on where we should try to focus the debug prints (a
dynamic debug filter).


Well, let's do a preliminary analysis.
The znode->cparent[znode->ciip] is a freed address in write_index(), which means:
1. 'znode->ciip' is valid, znode->cparent is freed by tnc_delete, however znode cannot be freed if znode->cnext is not NULL, which means:
a) 'znode->cparent' is not dirty, we should add an assertion like ubifs_assert(c, ubifs_zn_dirty(znode->cparent)) in get_znodes_to_commit(). Note, please check that 'znode->cparent' is not NULL before the assertion.
b) 'znode->cparent' is dirty, but it is not added into list 'c->cnext', we should traverse the entire TNC in get_znodes_to_commit() to make sure that all dirty znodes are collected into list 'c->cnext', so another assertion is needed.
2. 'znode->ciip' is invalid, and the value beyonds the memory area of znode->cparent. All znodes are allocated with size of 'c->max_znode_sz', which means that 'znode->ciip' exceeds the 'c->fantout', so we can add an assertion like ubifs_assert(c, znode->ciip < c->fantout) in get_znodes_to_commit().

That's what I can think of, are there any other possibilities?