Re: [PATCH v2 1/2] fs/buffer: avoid tail commit walk for uptodate folios

From: Jan Kara

Date: Mon Jun 08 2026 - 09:09:57 EST


On Mon 08-06-26 20:01:30, Jia Zhu wrote:
> block_commit_write() always walks every buffer_head attached to the
> folio. That was cheap for order-0 folios, but large folios can contain
> hundreds of buffer_heads. For a small buffered overwrite of an
> already-uptodate large folio, the commit work is therefore proportional
> to the folio size rather than the copied range.
>
> This became visible with ext4 regular-file large folios, where cached
> small overwrites reach block_commit_write() through block_write_end().
> Before ext4 enabled large folios for regular files, this path was only
> hit with order-0 folios for normal ext4 buffered writes, so the full walk
> was bounded. The ext4 large-folio commit is therefore the regression
> point for this generic helper cost.
>
> The full walk is still needed when the folio is not uptodate, because
> block_commit_write() uses per-buffer uptodate state to decide whether
> the whole folio can be marked uptodate. Keep those folios on the old
> full-buffer path.
>
> For a folio that was already uptodate on entry, the commit no longer
> needs tail buffers for folio-uptodate discovery. The copied range has
> already been processed once block_start reaches @to, so stop there and
> avoid the suffix walk.
>
> Fixes: 7ac67301e82f0 ("ext4: enable large folio for regular file")
> Suggested-by: Matthew Wilcox (Oracle) <willy@xxxxxxxxxxxxx>
> Signed-off-by: Jia Zhu <zhujia.zj@xxxxxxxxxxxxx>
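
To make the cost argument above concrete, here is a minimal userspace sketch of the loop with the early exit. It is only a model under stated assumptions: struct model_folio, model_commit_write() and MODEL_MAX_BLOCKS are hypothetical names, not kernel API, and the part of the loop body that lies outside the quoted hunk is an assumption about what the surrounding code does. The function returns the number of buffers visited so the two paths can be compared.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_MAX_BLOCKS 512	/* hypothetical cap, e.g. 2 MiB folio / 4 KiB blocks */

/* Hypothetical stand-in for a folio and its buffer ring; not kernel API. */
struct model_folio {
	size_t nr_blocks;			/* buffers attached to the folio */
	size_t blocksize;			/* bytes covered by each buffer */
	bool uptodate;				/* models folio_test_uptodate() */
	bool bh_uptodate[MODEL_MAX_BLOCKS];	/* per-buffer uptodate state */
	bool bh_dirty[MODEL_MAX_BLOCKS];	/* per-buffer dirty state */
};

/* Commit the byte range [from, to); returns how many buffers were visited. */
static size_t model_commit_write(struct model_folio *f, size_t from, size_t to)
{
	bool uptodate = f->uptodate;	/* sampled once on entry, as in the patch */
	bool partial = false;
	size_t block_start = 0, block_end;
	size_t visited = 0, i = 0;

	assert(from <= to);
	assert(f->nr_blocks > 0 && f->nr_blocks <= MODEL_MAX_BLOCKS);

	do {
		block_end = block_start + f->blocksize;
		visited++;
		if (block_end <= from || block_start >= to) {
			/* Outside the copied range: only inspect the state. */
			if (!f->bh_uptodate[i])
				partial = true;
		} else {
			/* Inside the copied range: mark uptodate and dirty. */
			f->bh_uptodate[i] = true;
			f->bh_dirty[i] = true;
		}

		block_start = block_end;
		/* Folio was uptodate on entry: the tail adds no information. */
		if (uptodate && block_start >= to)
			break;
		i++;
	} while (i < f->nr_blocks);

	/* The full walk is what allows a not-uptodate folio to be promoted. */
	if (!partial)
		f->uptodate = true;
	return visited;
}
```

With 512 buffers of 4096 bytes and a 100-byte write starting at offset 4096, this model visits 2 buffers when the folio is uptodate on entry and all 512 otherwise.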

Looks good. Feel free to add:

Reviewed-by: Jan Kara <jack@xxxxxxx>

Honza

> ---
> fs/buffer.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/fs/buffer.c b/fs/buffer.c
> index b0b3792b1496e..c8c41c799030d 100644
> --- a/fs/buffer.c
> +++ b/fs/buffer.c
> @@ -2096,6 +2096,7 @@ void block_commit_write(struct folio *folio, size_t from, size_t to)
>  {
>  	size_t block_start, block_end;
>  	bool partial = false;
> +	bool uptodate = folio_test_uptodate(folio);
>  	unsigned blocksize;
>  	struct buffer_head *bh, *head;
>
> @@ -2118,6 +2119,8 @@ void block_commit_write(struct folio *folio, size_t from, size_t to)
>  			clear_buffer_new(bh);
>
>  		block_start = block_end;
> +		if (uptodate && block_start >= to)
> +			break;
>  		bh = bh->b_this_page;
>  	} while (bh != head);
>
> --
> 2.20.1
--
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR