[RFC] ext4: possible inconsistency in ext4_append() error path

From: Vineet Agarwal

Date: Fri May 01 2026 - 13:25:45 EST


Hi,

While looking into ext4 directory operations, I noticed a possible
inconsistency in the error handling of ext4_append().

In ext4_append(), i_size and i_disksize are updated right after
ext4_bread() succeeds, before the remaining failure points have been
passed:

	bh = ext4_bread(handle, inode, *block, EXT4_GET_BLOCKS_CREATE);
	if (IS_ERR(bh))
		return bh;

	inode->i_size += inode->i_sb->s_blocksize;
	EXT4_I(inode)->i_disksize = inode->i_size;

	err = ext4_mark_inode_dirty(handle, inode);
	if (err)
		goto out;

	err = ext4_journal_get_write_access(handle, inode->i_sb, bh,
					    EXT4_JTR_NONE);
	if (err)
		goto out;

If either ext4_mark_inode_dirty() or
ext4_journal_get_write_access() fails, the function returns an
error, but neither i_size nor i_disksize is restored to its
original value.

Callers of ext4_append() appear to treat it as an all-or-nothing
operation:

	bh = ext4_append(handle, dir, &block);
	if (IS_ERR(bh))
		goto out;

However, in the failure case, inode->i_size may already have been
increased by one block.

One possible consequence is that subsequent checks relying on i_size,
such as:

	if (block >= inode->i_size >> inode->i_blkbits)

may accept the index of the block whose append failed, because the
inflated i_size makes that index look as if it were inside the
directory.
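
To make the arithmetic concrete, here is a small userspace model of
that check. It is only a sketch, not kernel code: struct size_model
and block_out_of_range() are hypothetical names, and the 4096-byte
block size in the note below is an assumed example value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the two inode fields the check reads. */
struct size_model {
	uint64_t i_size;	/* directory size in bytes */
	unsigned int i_blkbits;	/* log2 of the block size */
};

/* Same comparison as the quoted check: true means "block is rejected". */
static bool block_out_of_range(const struct size_model *inode, uint64_t block)
{
	return block >= (inode->i_size >> inode->i_blkbits);
}
```

With i_blkbits = 12 and i_size = 8192 (two blocks), block index 2 is
rejected. Once i_size has been bumped to 12288 by an append that then
failed, the same index 2 is accepted.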

I understand that journaling may ensure on-disk consistency, but the
in-memory inode state may still temporarily reflect a change that did
not logically succeed.

Is this behavior intentional? If not, should ext4_append() defer the
i_size update until after the last failure point, or roll it back on
error?
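
For discussion, the roll-back variant could look roughly like the
userspace sketch below. This is not a proposed patch: struct
append_model, step_fn, model_append(), step_ok() and step_fail() are
all hypothetical names that only model the control flow, and the
sketch deliberately ignores the block that ext4_bread() has already
allocated by the time the later steps run.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the inode fields touched by the append. */
struct append_model {
	uint64_t i_size;	/* in-memory size */
	uint64_t i_disksize;	/* size as recorded for disk */
	uint32_t blocksize;
};

/* Hypothetical fallible step: returns 0 or a negative errno-style value. */
typedef int (*step_fn)(struct append_model *inode);

/*
 * Model of the append path with roll-back: remember both sizes, bump
 * them, and restore them if either later step fails.
 */
static int model_append(struct append_model *inode,
			step_fn mark_dirty, step_fn get_write_access)
{
	const uint64_t old_size = inode->i_size;
	const uint64_t old_disksize = inode->i_disksize;
	int err;

	inode->i_size += inode->blocksize;
	inode->i_disksize = inode->i_size;

	err = mark_dirty(inode);
	if (err)
		goto rollback;

	err = get_write_access(inode);
	if (err)
		goto rollback;

	return 0;

rollback:
	/* Undo the size update so callers see an all-or-nothing result. */
	inode->i_size = old_size;
	inode->i_disksize = old_disksize;
	return err;
}

/* Test doubles for the two fallible steps. */
static int step_ok(struct append_model *inode)
{
	(void)inode;
	return 0;
}

static int step_fail(struct append_model *inode)
{
	(void)inode;
	return -5;	/* arbitrary negative error value for the model */
}
```

Calling model_append(&inode, step_ok, step_fail) returns the error
and leaves both sizes unchanged, whereas passing step_ok for both
steps grows them by one block. Whether restoring the sizes is
actually safe in the kernel, given the already allocated block, is
part of the question.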

Thanks,
Vineet Agarwal