[RFC] ext4: possible inconsistency in ext4_append() error path

From: Vineet Agarwal

Date: Fri May 01 2026 - 13:25:45 EST

Hi,

While looking into ext4 directory operations, I noticed a possible
inconsistency in the error handling of ext4_append().

In ext4_append(), the inode size is updated before all failure points
have been ruled out:

    bh = ext4_bread(handle, inode, *block, EXT4_GET_BLOCKS_CREATE);
    if (IS_ERR(bh))
        return bh;

    inode->i_size += inode->i_sb->s_blocksize;
    EXT4_I(inode)->i_disksize = inode->i_size;

    err = ext4_mark_inode_dirty(handle, inode);
    if (err)
        goto out;

    err = ext4_journal_get_write_access(handle, inode->i_sb, bh,
                                        EXT4_JTR_NONE);
    if (err)
        goto out;

If either ext4_mark_inode_dirty() or
ext4_journal_get_write_access() fails, the function returns an
error but does not restore the original inode size.

Callers of ext4_append() appear to treat it as an all-or-nothing
operation:

    bh = ext4_append(handle, dir, &block);
    if (IS_ERR(bh))
        goto out;

However, in the failure case, inode->i_size may already have been
increased.

One possible consequence is that subsequent checks relying on i_size,
such as:

    if (block >= inode->i_size >> inode->i_blkbits)

may allow a block index to pass bounds checks even though the append
operation did not complete successfully.
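
To make the arithmetic concrete, here is a small userspace model of the
concern. It is not kernel code: struct model_inode, MODEL_EIO and both
helper functions are hypothetical stand-ins I made up for illustration,
and the model only covers the size update and the quoted check, not
block allocation or journaling.

```c
#include <stdbool.h>
#include <stdint.h>

#define MODEL_EIO 5 /* hypothetical stand-in for an errno value */

/* Hypothetical userspace stand-in for the two inode fields in question. */
struct model_inode {
	uint64_t i_size;        /* in-memory size, in bytes */
	unsigned int i_blkbits; /* log2 of the block size */
};

/* Same shape as the quoted check: true when block is at or past i_size. */
static bool block_out_of_range(const struct model_inode *inode, uint64_t block)
{
	return block >= (inode->i_size >> inode->i_blkbits);
}

/*
 * Models only the ordering described above: the size is bumped by one
 * block first, and a failure in a later step returns an error without
 * restoring the old size.
 */
static int model_append_no_rollback(struct model_inode *inode,
				    bool later_step_fails)
{
	inode->i_size += (uint64_t)1 << inode->i_blkbits;
	if (later_step_fails)
		return -MODEL_EIO;
	return 0;
}
```

With a one-block directory (i_size = 4096, i_blkbits = 12), block 1 is
rejected by the check before the call. After a failed
model_append_no_rollback(), i_size is 8192 and block 1 is accepted.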

I understand that journaling may ensure on-disk consistency, but the
in-memory inode state may still temporarily reflect a change that did
not logically succeed.

Is this behavior intentional, or should ext4_append() avoid updating
i_size until after all failure points, or roll it back on error?
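
For discussion, the roll-back option could look roughly like the
following userspace sketch. All names here (struct model_dir_inode,
model_later_step(), model_append_with_rollback(), MODEL_EIO) are
hypothetical and only model the size fields. It does not address what
should happen to the block that ext4_bread() already allocated, or
whether restoring the sizes is safe once a journal operation has failed.

```c
#include <stdbool.h>
#include <stdint.h>

#define MODEL_EIO 5 /* hypothetical stand-in for an errno value */

/* Hypothetical userspace stand-in for the size fields touched above. */
struct model_dir_inode {
	uint64_t i_size;        /* in-memory size, in bytes */
	uint64_t i_disksize;    /* size as recorded for the on-disk inode */
	unsigned int i_blkbits; /* log2 of the block size */
};

/* Hypothetical stand-in for a step that may fail after the size bump. */
static int model_later_step(bool fail)
{
	return fail ? -MODEL_EIO : 0;
}

/* Bump both sizes, then restore the saved values if a later step fails. */
static int model_append_with_rollback(struct model_dir_inode *inode,
				      bool mark_dirty_fails,
				      bool write_access_fails)
{
	const uint64_t old_size = inode->i_size;
	const uint64_t old_disksize = inode->i_disksize;
	int err;

	inode->i_size += (uint64_t)1 << inode->i_blkbits;
	inode->i_disksize = inode->i_size;

	err = model_later_step(mark_dirty_fails);
	if (err)
		goto undo;

	err = model_later_step(write_access_fails);
	if (err)
		goto undo;

	return 0;

undo:
	/* Leave the caller with the sizes it had before the call. */
	inode->i_size = old_size;
	inode->i_disksize = old_disksize;
	return err;
}
```

The other option, deferring the size update until both steps have
succeeded, would avoid the undo label entirely, but I am not sure
whether ext4_mark_inode_dirty() needs to see the new size at that point.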

Thanks,
Vineet Agarwal