Re: [PATCH v3] ext4: avoid infinite loops caused by residual data

From: Edward Adam Davis

Date: Thu Mar 05 2026 - 07:19:09 EST

On Thu, 5 Mar 2026 11:38:00 +0100 Jan Kara wrote:
> > On the mkdir/mknod path, when mapping logical blocks to physical blocks,
> > if inserting a new extent into the extent tree fails (in this example,
> > because the file system disabled the huge file feature when marking the
> > inode as dirty),
>
> I don't quite understand what you mean here but I think you say that
> ext4_ext_dirty() -> ext4_mark_inode_dirty() returns error due to whatever
> corruption it has hit.
It returns -EFSCORRUPTED in following calltrace:
ext4_ext_dirty()->
ext4_mark_inode_dirty()->
__ext4_mark_inode_dirty()->
ext4_mark_iloc_dirty()->
ext4_do_update_inode()->
ext4_fill_raw_inode()->
ext4_inode_blocks_set()->
if (!ext4_has_feature_huge_file(sb))
return -EFSCORRUPTED;
>
> > ext4_ext_map_blocks() only calls ext4_free_blocks() to
> > reclaim the physical block without deleting the corresponding data in
> > the extent tree. This causes subsequent mkdir operations to reference
> > the previously reclaimed physical block number again, even though this
> > physical block is already being used by the xattr block. Therefore, a
> > situation arises where both the directory and xattr are using the same
> > buffer head block in memory simultaneously.
>
> OK, this indeed looks like "not so great" error handling. Thanks for
> digging into this.
>
> > The above causes ext4_xattr_block_set() to enter an infinite loop about
> > "inserted" and cannot release the inode lock, ultimately leading to the
> > 143s blocking problem mentioned in [1].
> >
> > By using ext4_ext_remove_space() to delete the inserted logical block
> > and reclaim the physical block when inserting a new extent fails during
> > extent block mapping, residual extent data can be prevented from affecting
> > subsequent logical block physical mappings.
> >
> > [1]
> > INFO: task syz.0.17:5995 blocked for more than 143 seconds.
> > Call Trace:
> > inode_lock_nested include/linux/fs.h:1073 [inline]
> > __start_dirop fs/namei.c:2923 [inline]
> > start_dirop fs/namei.c:2934 [inline]
> >
> > Reported-by: syzbot+512459401510e2a9a39f@xxxxxxxxxxxxxxxxxxxxxxxxx
> > Closes: https://syzkaller.appspot.com/bug?extid=1659aaaaa8d9d11265d7
> > Tested-by: syzbot+1659aaaaa8d9d11265d7@xxxxxxxxxxxxxxxxxxxxxxxxx
> > Reported-by: syzbot+1659aaaaa8d9d11265d7@xxxxxxxxxxxxxxxxxxxxxxxxx
> > Closes: https://syzkaller.appspot.com/bug?extid=512459401510e2a9a39f
> > Tested-by: syzbot+1659aaaaa8d9d11265d7@xxxxxxxxxxxxxxxxxxxxxxxxx
> > Signed-off-by: Edward Adam Davis <eadavis@xxxxxx>
> > ---
> ...
> > diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
> > index ae3804f36535..0bed3379f2d2 100644
> > --- a/fs/ext4/extents.c
> > +++ b/fs/ext4/extents.c
> > @@ -4458,19 +4458,13 @@ int ext4_ext_map_blocks(handle_t *handle, struct inode *inode,
> > if (IS_ERR(path)) {
> > err = PTR_ERR(path);
> > if (allocated_clusters) {
> > - int fb_flags = 0;
> > -
> > /*
> > * free data blocks we just allocated.
> > * not a good idea to call discard here directly,
> > * but otherwise we'd need to call it every free().
> > */
> > ext4_discard_preallocations(inode);
> > - if (flags & EXT4_GET_BLOCKS_DELALLOC_RESERVE)
> > - fb_flags = EXT4_FREE_BLOCKS_NO_QUOT_UPDATE;
> > - ext4_free_blocks(handle, inode, NULL, newblock,
> > - EXT4_C2B(sbi, allocated_clusters),
> > - fb_flags);
> > + ext4_ext_remove_space(inode, newex.ee_block, newex.ee_block);
>
> So I'm concerned that if the metadata is corrupted, then trying to remove
> some extent space can do even more harm. Also in case
> EXT4_GET_BLOCKS_DELALLOC_RESERVE was passed, we now wrongly update quota
> information. So this definitely isn't a correct fix. What I'd do instead
> would be distinguishing two cases:
>
> 1) The error is ENOSPC or EDQUOT - in this case the filesystem is fully
> consistent and we must maintain its consistency including all the
> accounting. However these errors can happen only early before we've
> inserted the extent into the extent tree. So current code works correctly
> for this case.
>
> 2) Some other error - this means metadata is corrupted. We should strive to
> do as few modifications as possible to limit damage. So I'd just skip
> freeing of allocated blocks.
>
> Long story short I think we should just modify the above condition:
>
> if (allocated_clusters)
>
> to
>
> /*
> * Gracefully handle out of space conditions. If the filesystem is
> * inconsistent, we'll just leak allocated blocks to avoid causing
> * even more damage.
> */
> if (allocated_clusters && (err == -EDQUOT || err == -ENOSPC))
I have tested the modified code, and now it no longer deletes any data
for the -EFSCORRUPTED error (before applying my patch, it only deleted
the physical block and not the corresponding data in the extent tree).
This also prevents conflicts caused by data being reused by dir and
xattr.

I will release a new patch later to add the filtering you requested.

BR,
Edward