Re: [PATCH v5] ext4: avoid infinite loops caused by residual data

From: Jan Kara

Date: Fri Mar 06 2026 - 10:26:52 EST

On Fri 06-03-26 09:31:58, Edward Adam Davis wrote:
> On the mkdir/mknod path, when mapping logical blocks to physical blocks,
> if inserting a new extent into the extent tree fails (in this example,
> because the file system disabled the huge file feature when marking the
> inode as dirty), ext4_ext_map_blocks() only calls ext4_free_blocks() to
> reclaim the physical block without deleting the corresponding data in
> the extent tree. This causes subsequent mkdir operations to reference
> the previously reclaimed physical block number again, even though this
> physical block is already being used by the xattr block. Therefore, a
> situation arises where both the directory and xattr are using the same
> buffer head block in memory simultaneously.
>
> The above causes ext4_xattr_block_set() to loop forever around
> "inserted" without ever releasing the inode lock, ultimately leading to
> the 143s blocking problem mentioned in [1].
>
> If the metadata is corrupted, then trying to remove some extent space
> can do even more harm. Also, in case EXT4_GET_BLOCKS_DELALLOC_RESERVE
> was passed, removing space would wrongly update quota information.
> Jan Kara suggests distinguishing between two cases:
>
> 1) The error is ENOSPC or EDQUOT - in this case the filesystem is fully
> consistent and we must maintain its consistency including all the
> accounting. However these errors can happen only early before we've
> inserted the extent into the extent tree. So current code works correctly
> for this case.
>
> 2) Some other error - this means metadata is corrupted. We should strive to
> do as few modifications as possible to limit damage. So I'd just skip
> freeing of allocated blocks.
>
> [1]
> INFO: task syz.0.17:5995 blocked for more than 143 seconds.
> Call Trace:
> inode_lock_nested include/linux/fs.h:1073 [inline]
> __start_dirop fs/namei.c:2923 [inline]
> start_dirop fs/namei.c:2934 [inline]
>
> Reported-by: syzbot+512459401510e2a9a39f@xxxxxxxxxxxxxxxxxxxxxxxxx
> Closes: https://syzkaller.appspot.com/bug?extid=1659aaaaa8d9d11265d7
> Tested-by: syzbot+1659aaaaa8d9d11265d7@xxxxxxxxxxxxxxxxxxxxxxxxx
> Reported-by: syzbot+1659aaaaa8d9d11265d7@xxxxxxxxxxxxxxxxxxxxxxxxx
> Closes: https://syzkaller.appspot.com/bug?extid=512459401510e2a9a39f
> Tested-by: syzbot+1659aaaaa8d9d11265d7@xxxxxxxxxxxxxxxxxxxxxxxxx
> Signed-off-by: Edward Adam Davis <eadavis@xxxxxx>

Looks good to me! Feel free to add:

Reviewed-by: Jan Kara <jack@xxxxxxx>

Honza

> ---
> v1 -> v2: fix ci reported issues
> v2 -> v3: new fix for removing residual data and update subject and comments
> v3 -> v4: filtering already allocated blocks and update comments
> v4 -> v5: don't touch corrupted data and update comments
>
> fs/ext4/extents.c | 8 ++++++--
> 1 file changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
> index ae3804f36535..4779da94f816 100644
> --- a/fs/ext4/extents.c
> +++ b/fs/ext4/extents.c
> @@ -4457,9 +4457,13 @@ int ext4_ext_map_blocks(handle_t *handle, struct inode *inode,
>  	path = ext4_ext_insert_extent(handle, inode, path, &newex, flags);
>  	if (IS_ERR(path)) {
>  		err = PTR_ERR(path);
> -		if (allocated_clusters) {
> +		/*
> +		 * Gracefully handle out of space conditions. If the filesystem
> +		 * is inconsistent, we'll just leak allocated blocks to avoid
> +		 * causing even more damage.
> +		 */
> +		if (allocated_clusters && (err == -EDQUOT || err == -ENOSPC)) {
>  			int fb_flags = 0;
> -
>  			/*
>  			 * free data blocks we just allocated.
>  			 * not a good idea to call discard here directly,
> --
> 2.43.0
>
--
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR
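
For readers skimming the thread, the decision the patch encodes can be sketched in isolation. This is only an illustration in userspace C, not kernel code: the helper name should_free_allocated() is hypothetical and does not appear in the patch, which open-codes the same condition inside ext4_ext_map_blocks().

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical helper (not part of the patch): decide whether blocks
 * allocated before a failed extent insertion should be freed again.
 *
 * -ENOSPC / -EDQUOT: the filesystem is still consistent, so undo the
 *                    allocation and keep the accounting correct.
 * any other error:   metadata is presumed corrupted, so leak the blocks
 *                    rather than modify the filesystem any further.
 */
static bool should_free_allocated(int err, unsigned int allocated_clusters)
{
	if (!allocated_clusters)
		return false;	/* nothing was allocated, nothing to undo */

	return err == -ENOSPC || err == -EDQUOT;
}
```

With this sketch, should_free_allocated(-EIO, 1) is false, so the blocks are leaked, while should_free_allocated(-ENOSPC, 1) is true and the blocks go back to the allocator.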