Re: [PATCH v2] ext4: don't order data when zeroing unwritten or delayed block

From: Baokun Li

Date: Wed Dec 31 2025 - 01:40:06 EST


On 2025-12-23 09:19, Zhang Yi wrote:
> From: Zhang Yi <yi.zhang@xxxxxxxxxx>
>
> When zeroing out a written partial block, it is necessary to order the
> data to prevent exposing stale data on disk. However, if the buffer is
> unwritten or delayed, it is not allocated as written, so ordering the
> data is not required. This can prevent strange and unnecessary ordered
> writes when appending data across a region within a block.
>
> Assume we have a 2K unwritten file on a filesystem with a 4K block size,
> and a buffered write from 3K to 4K. Before this patch,
> __ext4_block_zero_page_range() would add the range [2K,3K) to the
> ordered range, and then the JBD2 commit process would write back this
> block. However, that writeback does nothing because the block is not
> mapped as written, so the folio is redirtied and written back again
> through the normal writeback process.
>
> Signed-off-by: Zhang Yi <yi.zhang@xxxxxxxxxx>
> Reviewed-by: Jan Kara <jack@xxxxxxx>

Makes sense. Feel free to add:

Reviewed-by: Baokun Li <libaokun1@xxxxxxxxxx>
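
For anyone following along, the range in the commit message example can
be reproduced with a few lines of userspace arithmetic. This is only a
sketch: toy_range and toy_zero_gap() are hypothetical names, not ext4
functions, and the model assumes the zeroed gap is simply the bytes
between the old EOF and the start of the write, clipped to the block
that contains the old EOF.

```c
#include <stdint.h>

/* Hypothetical helper type: a byte range [start, start + len). */
struct toy_range {
	uint64_t start;
	uint64_t len;
};

/*
 * Toy model, not kernel code: compute the gap between the old EOF and
 * the start of a later write, limited to the block containing the old
 * EOF. Returns len == 0 when the old EOF is block aligned or the write
 * does not start beyond it.
 */
static struct toy_range toy_zero_gap(uint64_t old_size, uint64_t write_pos,
				     uint64_t blocksize)
{
	struct toy_range r = { old_size, 0 };
	uint64_t block_end = (old_size / blocksize + 1) * blocksize;
	uint64_t end;

	if (old_size % blocksize == 0 || write_pos <= old_size)
		return r;

	end = write_pos < block_end ? write_pos : block_end;
	r.len = end - old_size;
	return r;
}
```

With old_size = 2048, write_pos = 3072 and blocksize = 4096 this gives
start = 2048 and len = 1024, i.e. the [2K,3K) range mentioned above.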

> ---
> fs/ext4/inode.c | 8 ++++++--
> 1 file changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index 2e79b09fe2f0..f2d70c9af446 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -4109,9 +4109,13 @@ static int __ext4_block_zero_page_range(handle_t *handle,
> if (ext4_should_journal_data(inode)) {
> err = ext4_dirty_journalled_data(handle, bh);
> } else {
> - err = 0;
> mark_buffer_dirty(bh);
> - if (ext4_should_order_data(inode))
> + /*
> + * Only the written block requires ordered data to prevent
> + * exposing stale data.
> + */
> + if (!buffer_unwritten(bh) && !buffer_delay(bh) &&
> + ext4_should_order_data(inode))
> err = ext4_jbd2_inode_add_write(handle, inode, from,
> length);
> }
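
The new condition can also be read as a small truth table. The sketch
below is a userspace model only: struct toy_bh and
toy_needs_ordered_write() are hypothetical stand-ins for the
buffer_unwritten()/buffer_delay() state bits and the
ext4_should_order_data() check, and nothing here calls into the kernel.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for the two buffer_head state bits the patch tests. */
struct toy_bh {
	bool unwritten;	/* models buffer_unwritten(bh) */
	bool delay;	/* models buffer_delay(bh) */
};

/*
 * Hypothetical helper mirroring the patched condition: the zeroed range
 * joins the ordered-data list only if the block is already mapped as
 * written and the inode uses ordered data mode.
 */
static bool toy_needs_ordered_write(const struct toy_bh *bh,
				    bool order_data_mode)
{
	return !bh->unwritten && !bh->delay && order_data_mode;
}

/* Walk through the cases described in the commit message. */
static void toy_check_cases(void)
{
	struct toy_bh written = { false, false };
	struct toy_bh unwritten = { true, false };
	struct toy_bh delayed = { false, true };

	assert(toy_needs_ordered_write(&written, true));
	assert(!toy_needs_ordered_write(&unwritten, true));
	assert(!toy_needs_ordered_write(&delayed, true));
	assert(!toy_needs_ordered_write(&written, false));
}
```

Only the written block in ordered mode ends up on the ordered list; the
unwritten and delayed cases are left to normal writeback.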