On Wed, Jan 04, 2023 at 09:58:03AM +0800, Jun Nie wrote:
Darrick J. Wong <djwong@xxxxxxxxxx> wrote on Wed, Jan 4, 2023 at 03:17:
On Thu, Dec 29, 2022 at 09:45:02AM +0800, Jun Nie wrote:
For 1k-block filesystems, the filesystem starts at block 1, not block 0.
If start_fsb is 0, it will be bumped up to s_first_data_block. Then
ext4_get_group_no_and_offset doesn't know what to do and returns garbage
results (blockgroup 2^32-1). The underflow makes the index
exceed es->s_groups_count in ext4_get_group_info() and triggers the BUG_ON.
Fixes: 4a4956249dac0 ("ext4: fix off-by-one fsmap error on 1k block filesystems")
Link: https://syzkaller.appspot.com/bug?id=79d5768e9bfe362911ac1a5057a36fc6b5c30002
Reported-by: syzbot+6be2b977c89f79b6b153@xxxxxxxxxxxxxxxxxxxxxxxxx
Signed-off-by: Jun Nie <jun.nie@xxxxxxxxxx>
---
fs/ext4/fsmap.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/fs/ext4/fsmap.c b/fs/ext4/fsmap.c
index 4493ef0c715e..1aef127b0634 100644
--- a/fs/ext4/fsmap.c
+++ b/fs/ext4/fsmap.c
@@ -702,6 +702,12 @@ int ext4_getfsmap(struct super_block *sb, struct ext4_fsmap_head *head,
if (handlers[i].gfd_dev > head->fmh_keys[0].fmr_device)
memset(&dkeys[0], 0, sizeof(struct ext4_fsmap));
+ /*
+ * Re-check the range after above limit operation and reject
+ * 1K fs on block 0 as fs should start block 1. */
+ if (dkeys[0].fmr_physical ==0 && dkeys[1].fmr_physical == 0)
+ continue;
...and if this filesystem has 4k blocks, and therefore *does* define a
block 0?
Yes, this is a real corner case test :-)
So I'm really nervous about this change. I don't understand the code;
and I don't understand how the reproducer works. I can certainly
reproduce it using the reproducer found here[1], but it seems to
require running multiple processes all creating loop devices and then
running FS_IOC_GETFSMAP.
[1] https://syzkaller.appspot.com/bug?id=79d5768e9bfe362911ac1a5057a36fc6b5c30002
If I change the reproducer to just run the execute_one() once, it
doesn't trigger the bug. It seems to only trigger when you have
multiple processes all racing to create a loop device, mount the file
system, try running FS_IOC_GETFSMAP --- and then delete the loop device
without actually unmounting the file system. Which is **weird**.
I've tried taking the image, and just running "xfs_io -c fsmap /mnt",
and that doesn't trigger it either.
And I don't understand the reply to Darrick's question about why it's
safe to add the check since for 4k block file systems, block 0 *is*
valid.
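
For reference, here is a minimal userspace sketch of the arithmetic the
commit message describes. It is not the kernel code: group_of_block() and
group_in_range() are hypothetical stand-ins for
ext4_get_group_no_and_offset() and the range check behind the BUG_ON, and
the blocks-per-group values (8192 for 1k blocks, 32768 for 4k blocks) are
assumed mkfs defaults.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical, simplified model of the block-to-group mapping.
 * first_data_block is 1 on a 1k-block filesystem and 0 on a 4k-block one.
 */
static uint32_t group_of_block(uint64_t blocknr, uint32_t first_data_block,
                               uint32_t blocks_per_group)
{
    assert(blocks_per_group != 0);

    /* Unsigned subtraction: wraps around if blocknr < first_data_block. */
    blocknr -= first_data_block;

    /* The group number is 32 bits wide, so the quotient is truncated. */
    return (uint32_t)(blocknr / blocks_per_group);
}

/* Hypothetical model of the check that fires as a BUG_ON when it fails. */
static int group_in_range(uint32_t group, uint32_t groups_count)
{
    return group < groups_count;
}
```

Under these assumptions, group_of_block(0, 1, 8192) wraps and yields
0xFFFFFFFF (2^32-1), which fails group_in_range() for any real group
count, while group_of_block(0, 0, 32768) yields group 0. That is why a
check that rejects physical block 0 unconditionally looks wrong for 4k
block file systems.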
So if someone can explain to me what is going on here with this code
(there are too many abstractions and what's going on with keys is just
making my head hurt), *and* what the change actually does, and how to
reproduce the problem with a ***simple*** reproducer -- the syzbot
mess doesn't count, that would be great. But applying a change that I
don't understand to code I don't understand, to fix a reproducer which
I also don't understand, just doesn't make me feel comfortable.