Re: [PATCH RFC] ubifs: Fix use-after-free in ubifs_tnc_end_commit

From: Zhihao Cheng
Date: Thu Nov 07 2024 - 02:15:10 EST


在 2024/11/7 0:36, Waqar Hameed 写道:
Sorry for the late response Zhihao! I've been quite busy these days...

On Fri, Oct 18, 2024 at 09:40 +0800 Zhihao Cheng <chengzhihao1@xxxxxxxxxx> wrote:

在 2024/10/18 2:36, Waqar Hameed 写道:
On Wed, Oct 16, 2024 at 10:11 +0800 Zhihao Cheng <chengzhihao1@xxxxxxxxxx> wrote:
[...]

BTW, what is the configuration of your flash?(eg. erase size, page size)?
$ mtdinfo /dev/mtd2
mtd2
Name: firmware
Type: nand
Eraseblock size: 131072 bytes, 128.0 KiB
Amount of eraseblocks: 1832 (240123904 bytes, 229.0 MiB)
Minimum input/output unit size: 2048 bytes
Sub-page size: 2048 bytes
OOB size: 64 bytes
Character device major/minor: 90:4
Bad blocks are allowed: true
Device is writable: true
$ ubinfo /dev/ubi0_0
Volume ID: 0 (on ubi0)
Type: dynamic
Alignment: 1
Size: 661 LEBs (83931136 bytes, 80.0 MiB)
State: OK
Name: test-vol
Character device major/minor: 244:1
[...]

Thanks, I will change my nandsim configurations to generate a mtd device the
same model.

Did you manage to reproduce the issue with this?

I tried, but I still cannot reproduce it on my local machine.


Well, let's do a preliminary analysis.
The znode->cparent[znode->ciip] is a freed address in write_index(), which
means:
1. 'znode->ciip' is valid, znode->cparent is freed by tnc_delete, however znode
cannot be freed if znode->cnext is not NULL, which means:
a) 'znode->cparent' is not dirty, we should add an assertion like
ubifs_assert(c, ubifs_zn_dirty(znode->cparent)) in get_znodes_to_commit().
Note, please check that 'znode->cparent' is not NULL before the assertion.
b) 'znode->cparent' is dirty, but it is not added into list 'c->cnext', we
should traverse the entire TNC in get_znodes_to_commit() to make sure that all
dirty znodes are collected into list 'c->cnext', so another assertion is
needed.

I'm a little worried that traversing the whole TNC could change the
timing behavior, and thus might not trigger the race. Let's do that in
steps? Start with the other asserts (see diff below) and later just do
this assert. Does that sound reasonable?

Fine. I add one comment below.

I could modify `dbg_check_tnc()` so that it also checks that each dirty
`znode` is present in `c->cnext` list. We then call this at the end of
`get_znodes_to_commit()`.


Sounds good to me, please remove other non-related checks in dbg_check_tnc().
2. 'znode->ciip' is invalid, and the value beyonds the memory area of
znode->cparent. All znodes are allocated with size of 'c->max_znode_sz', which
means that 'znode->ciip' exceeds the 'c->fantout', so we can add an assertion
like ubifs_assert(c, znode->ciip < c->fantout) in get_znodes_to_commit().

That's what I can think of, are there any other possibilities?
I looked a little more at `get_znodes_to_commit()` when adding the
asserts you suggest, and I have a question: what happens when
`find_next_dirty()` returns `NULL`? In that case
```
znode->cnext = c->cnext;
```
but `znode->cparent` and `znode->ciip` are not updated. Shouldn't they?

Good thinking.
According to the implementation of find_next_dirty(), the order of dirty znodes
collection is bottom-up, which means that the last dirty znode is the root
znode, so it doesn't have a parent. You can verify that by adding assertion to
check whether the last dirty znode is the root.

[...]

To summarize, I'll start a run with the following asserts:

diff --git a/fs/ubifs/tnc_commit.c b/fs/ubifs/tnc_commit.c
index a55e04822d16..4eef82e02afe 100644
--- a/fs/ubifs/tnc_commit.c
+++ b/fs/ubifs/tnc_commit.c
@@ -652,11 +652,17 @@ static int get_znodes_to_commit(struct ubifs_info *c)
}
cnt += 1;
while (1) {

Please move the check after the assignment of 'znode->cparent', because 'znode->parent' could be switched by tnc_insert().
+ ubifs_assert(c, znode->ciip < c->fantout);
+ if (znode->cparent) {
+ ubifs_assert(c, ubifs_zn_dirty(znode->cparent));
+ }
+
ubifs_assert(c, !ubifs_zn_cow(znode));
__set_bit(COW_ZNODE, &znode->flags);
znode->alt = 0;
cnext = find_next_dirty(znode);
if (!cnext) {
+ ubifs_assert(c, znode == c->zroot.znode);
znode->cnext = c->cnext;
break;
}


@@ -662,6 +662,10 @@ static int get_znodes_to_commit(struct ubifs_info *c)
}
znode->cparent = znode->parent;
znode->ciip = znode->iip;
+ if (znode->cparent) {
+ ubifs_assert(c, ubifs_zn_dirty(znode->cparent));
+ }
+ ubifs_assert(c, znode->ciip < c->fantout);
znode->cnext = cnext;
znode = cnext;
cnt += 1;

Then later, another run with a modified `dbg_check_tnc()` to check that
all dirty `znode`s are indeed present in the list `c->cnext`.
.