Re: [PATCH v3] ext4: avoid infinite loops caused by residual data
From: Jan Kara
Date: Thu Mar 05 2026 - 05:39:54 EST
On Thu 05-03-26 15:25:46, Edward Adam Davis wrote:
> On the mkdir/mknod path, when mapping logical blocks to physical blocks,
> if inserting a new extent into the extent tree fails (in this example,
> because the file system disabled the huge file feature when marking the
> inode as dirty),
I don't quite understand what you mean here but I think you say that
ext4_ext_dirty() -> ext4_mark_inode_dirty() returns error due to whatever
corruption it has hit.
> ext4_ext_map_blocks() only calls ext4_free_blocks() to
> reclaim the physical block without deleting the corresponding data in
> the extent tree. This causes subsequent mkdir operations to reference
> the previously reclaimed physical block number again, even though this
> physical block is already being used by the xattr block. Therefore, a
> situation arises where both the directory and xattr are using the same
> buffer head block in memory simultaneously.
OK, this indeed looks like "not so great" error handling. Thanks for
digging into this.
> The above causes ext4_xattr_block_set() to enter an infinite loop about
> "inserted" and cannot release the inode lock, ultimately leading to the
> 143s blocking problem mentioned in [1].
>
> By using ext4_ext_remove_space() to delete the inserted logical block
> and reclaim the physical block when inserting a new extent fails during
> extent block mapping, residual extent data can be prevented from affecting
> subsequent logical block physical mappings.
>
> [1]
> INFO: task syz.0.17:5995 blocked for more than 143 seconds.
> Call Trace:
> inode_lock_nested include/linux/fs.h:1073 [inline]
> __start_dirop fs/namei.c:2923 [inline]
> start_dirop fs/namei.c:2934 [inline]
>
> Reported-by: syzbot+512459401510e2a9a39f@xxxxxxxxxxxxxxxxxxxxxxxxx
> Closes: https://syzkaller.appspot.com/bug?extid=1659aaaaa8d9d11265d7
> Tested-by: syzbot+1659aaaaa8d9d11265d7@xxxxxxxxxxxxxxxxxxxxxxxxx
> Reported-by: syzbot+1659aaaaa8d9d11265d7@xxxxxxxxxxxxxxxxxxxxxxxxx
> Closes: https://syzkaller.appspot.com/bug?extid=512459401510e2a9a39f
> Tested-by: syzbot+1659aaaaa8d9d11265d7@xxxxxxxxxxxxxxxxxxxxxxxxx
> Signed-off-by: Edward Adam Davis <eadavis@xxxxxx>
> ---
...
> diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
> index ae3804f36535..0bed3379f2d2 100644
> --- a/fs/ext4/extents.c
> +++ b/fs/ext4/extents.c
> @@ -4458,19 +4458,13 @@ int ext4_ext_map_blocks(handle_t *handle, struct inode *inode,
> if (IS_ERR(path)) {
> err = PTR_ERR(path);
> if (allocated_clusters) {
> - int fb_flags = 0;
> -
> /*
> * free data blocks we just allocated.
> * not a good idea to call discard here directly,
> * but otherwise we'd need to call it every free().
> */
> ext4_discard_preallocations(inode);
> - if (flags & EXT4_GET_BLOCKS_DELALLOC_RESERVE)
> - fb_flags = EXT4_FREE_BLOCKS_NO_QUOT_UPDATE;
> - ext4_free_blocks(handle, inode, NULL, newblock,
> - EXT4_C2B(sbi, allocated_clusters),
> - fb_flags);
> + ext4_ext_remove_space(inode, newex.ee_block, newex.ee_block);
So I'm concerned that if the metadata is corrupted, then trying to remove
some extent space can do even more harm. Also in case
EXT4_GET_BLOCKS_DELALLOC_RESERVE was passed, we now wrongly update quota
information. So this definitely isn't a correct fix. What I'd do instead
would be distinguishing two cases:
1) The error is ENOSPC or EDQUOT - in this case the filesystem is fully
consistent and we must maintain its consistency including all the
accounting. However these errors can happen only early before we've
inserted the extent into the extent tree. So current code works correctly
for this case.
2) Some other error - this means metadata is corrupted. We should strive to
do as few modifications as possible to limit damage. So I'd just skip
freeing of allocated blocks.
Long story short I think we should just modify the above condition:
if (allocated_clusters)
to
/*
* Gracefully handle out of space conditions. If the filesystem is
* inconsistent, we'll just leak allocated blocks to avoid causing
* even more damage.
*/
if (allocated_clusters && (err == -EDQUOT || err == -ENOSPC))
Honza
--
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR