Re: [PATCH] ext4: don't order data when zeroing unwritten or delayed block

From: Jan Kara

Date: Mon Dec 22 2025 - 05:48:20 EST


On Mon 22-12-25 09:31:36, Zhang Yi wrote:
> From: Zhang Yi <yi.zhang@xxxxxxxxxx>
>
> When zeroing out a written partial block, it is necessary to order the
> data to prevent exposing stale data on disk. However, if the buffer is
> unwritten or delayed, it is not allocated as written, so ordering the
> data is not required. This can prevent strange and unnecessary ordered
> writes when appending data across a region within a block.
>
> Assume we have a 2K unwritten file on a filesystem with a 4K block size,
> and we do a buffered write from 3K to 4K. Before this patch,
> __ext4_block_zero_page_range() would add the range [2K,3K) to the
> ordered range, and then the JBD2 commit process would write back this
> block. However, it does nothing since the block is not mapped, this
                                                     ^^^ by this you
mean that the block is unwritten, don't you?

> folio will be redirtied and written back again through the normal write
> back process.
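
For illustration only, the arithmetic in the example above can be sketched
in userspace C. This is a simplified model under stated assumptions, not
ext4 code: struct zero_range and eof_gap_in_block() are hypothetical names,
and the model only assumes that the gap between the old end of file and the
start of the write is zeroed within the block that contains the old EOF.

```c
#include <stdint.h>

/* Hypothetical type: half-open byte range [from, from + length). */
struct zero_range {
	uint64_t from;
	uint64_t length;
};

/*
 * Simplified model (an assumption, not the ext4 implementation): when a
 * write starts beyond the old end of file, the gap between the old EOF and
 * the write start is zeroed, limited to the block containing the old EOF.
 */
static struct zero_range eof_gap_in_block(uint64_t old_size, uint64_t write_pos,
					  uint64_t blocksize)
{
	struct zero_range r = { old_size, 0 };
	uint64_t block_end;

	/* No gap to zero, or the old EOF is block aligned. */
	if (write_pos <= old_size || old_size % blocksize == 0)
		return r;

	block_end = (old_size / blocksize + 1) * blocksize;
	r.length = (write_pos < block_end ? write_pos : block_end) - old_size;
	return r;
}
```

With old_size = 2048, write_pos = 3072 and blocksize = 4096, the model
yields from = 2048 and length = 1024, which is the [2K,3K) range named in
the description.
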
>
> Signed-off-by: Zhang Yi <yi.zhang@xxxxxxxxxx>

The patch looks good. Feel free to add:

Reviewed-by: Jan Kara <jack@xxxxxxx>

Honza

> ---
> fs/ext4/inode.c | 8 ++++++--
> 1 file changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index fa579e857baf..fc16a89903b9 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -4104,9 +4104,13 @@ static int __ext4_block_zero_page_range(handle_t *handle,
>  	if (ext4_should_journal_data(inode)) {
>  		err = ext4_dirty_journalled_data(handle, bh);
>  	} else {
> -		err = 0;
>  		mark_buffer_dirty(bh);
> -		if (ext4_should_order_data(inode))
> +		/*
> +		 * Only the written block requires ordered data to prevent
> +		 * exposing stale data.
> +		 */
> +		if (!buffer_unwritten(bh) && !buffer_delay(bh) &&
> +		    ext4_should_order_data(inode))
>  			err = ext4_jbd2_inode_add_write(handle, inode, from,
>  					length);
>  	}
> --
> 2.52.0
>
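
As a rough illustration of the new condition in the hunk above, here is a
userspace model of the decision. It is a sketch under stated assumptions:
struct bh_state and needs_ordered_write() are hypothetical names that only
stand in for the buffer_unwritten(), buffer_delay() and
ext4_should_order_data() checks, and nothing here calls kernel code.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the buffer_head state bits the patch tests. */
struct bh_state {
	bool unwritten;	/* models buffer_unwritten(bh) */
	bool delay;	/* models buffer_delay(bh) */
};

/*
 * Models the patched condition: the zeroed range is added to the ordered
 * data only if the block is already written (neither unwritten nor
 * delayed) and the inode uses ordered data mode.
 */
static bool needs_ordered_write(struct bh_state bh, bool order_data_mode)
{
	return !bh.unwritten && !bh.delay && order_data_mode;
}
```

In this model, only a written block in ordered mode returns true. An
unwritten or delayed block returns false, which matches the case in the
description where the ordered write would have had nothing to do.
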
--
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR