[PATCH 0/2] ext4: reduce fast-commit write amplification for scattered writes

From: Daejun Park

Date: Thu Jun 11 2026 - 00:48:02 EST


ext4 fast commit tracks a single coalesced logical range per inode. When
an inode is dirtied at several disjoint offsets between two commits (e.g.
sparse/scattered random writes), that range is widened to span [min, max]
of all the touched offsets, and ext4_fc_write_inode_data() then re-logs
every extent inside that span -- including the unmodified ones. On a
sparsely allocated file this inflates fast-commit traffic and frequently
overflows the fast-commit area, forcing a fallback to a full jbd2 commit.
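
To make the widening concrete, here is a minimal user-space sketch of
single-range coalescing. It is an illustration only: the names fc_span and
fc_span_track are hypothetical and are not taken from fs/ext4, and
arithmetic overflow is ignored.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of "one coalesced logical range per inode". */
struct fc_span {
	uint32_t start;	/* first tracked logical block */
	uint32_t len;	/* number of blocks; 0 means nothing tracked yet */
};

/* Widen the span so that it also covers [lblk, lblk + count). */
static void fc_span_track(struct fc_span *s, uint32_t lblk, uint32_t count)
{
	uint32_t old_end, new_end, start, end;

	assert(count > 0);
	if (s->len == 0) {
		s->start = lblk;
		s->len = count;
		return;
	}
	old_end = s->start + s->len;
	new_end = lblk + count;
	start = lblk < s->start ? lblk : s->start;
	end = new_end > old_end ? new_end : old_end;
	s->start = start;
	s->len = end - start;	/* everything between min and max is covered */
}
```

In this model, two one-block writes at logical blocks 10 and 100000 leave
a span of 99991 blocks, so every extent mapped between them would be
logged again even though only two blocks changed.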

This series replaces the single range with a small, bounded set of disjoint
ranges so that only the actually-modified regions are logged, while keeping
the per-inode memory cost negligible:

  1/2 tracks up to EXT4_FC_MAX_RANGES (16) disjoint ranges, merging the two
      closest ranges when the set would overflow -- so the worst case
      degrades gracefully to the old single-span behaviour. The on-disk
      fast-commit (TLV) format is unchanged.

  2/2 allocates that array lazily: the first range is kept inline, the array
      is allocated only when a second disjoint range appears, and on an
      allocation failure we fall back to the inline single range. The
      per-inode fast-commit footprint drops from ~140 to 20 bytes.
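
The two steps above can be sketched in user space as follows. This is not
the code from the patches: every name here (fc_range, fc_ranges,
fc_ranges_add, fc_ranges_at, FC_MAX_RANGES) is hypothetical, ranges are
half-open block intervals, calloc() stands in for the kernel allocator,
and resetting the set after a commit is not modelled.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define FC_MAX_RANGES 16	/* stands in for EXT4_FC_MAX_RANGES */

struct fc_range { uint32_t start, end; };	/* half-open [start, end) */

struct fc_ranges {
	struct fc_range inline_range;	/* used while only one range exists */
	struct fc_range *arr;		/* sorted, disjoint; NULL until needed */
	unsigned int nr;		/* number of ranges in use */
};

/* Return range i from wherever the set currently lives. */
static struct fc_range *fc_ranges_at(struct fc_ranges *fr, unsigned int i)
{
	return fr->arr ? &fr->arr[i] : &fr->inline_range;
}

static void fc_widen(struct fc_range *r, uint32_t start, uint32_t end)
{
	if (start < r->start)
		r->start = start;
	if (end > r->end)
		r->end = end;
}

static void fc_ranges_add(struct fc_ranges *fr, uint32_t start, uint32_t end)
{
	struct fc_range *a;
	unsigned int i = 0, j, best = 0;

	assert(start < end);
	if (fr->nr == 0) {
		*fc_ranges_at(fr, 0) = (struct fc_range){ start, end };
		fr->nr = 1;
		return;
	}
	if (!fr->arr) {
		struct fc_range *in = &fr->inline_range;

		/* Allocate only when a second, disjoint range shows up. */
		if (start > in->end || end < in->start)
			fr->arr = calloc(FC_MAX_RANGES + 1, sizeof(*fr->arr));
		if (!fr->arr) {
			/* Touching range or failed allocation: single span. */
			fc_widen(in, start, end);
			return;
		}
		fr->arr[0] = *in;
	}
	a = fr->arr;

	/* Skip ranges that end before the new one starts. */
	while (i < fr->nr && a[i].end < start)
		i++;
	/* Absorb every range that overlaps or abuts the new one. */
	for (j = i; j < fr->nr && a[j].start <= end; j++) {
		if (a[j].start < start)
			start = a[j].start;
		if (a[j].end > end)
			end = a[j].end;
	}
	/* Replace a[i..j) with the merged range (one spare slot exists). */
	memmove(&a[i + 1], &a[j], (fr->nr - j) * sizeof(*a));
	a[i] = (struct fc_range){ start, end };
	fr->nr = fr->nr - (j - i) + 1;

	if (fr->nr <= FC_MAX_RANGES)
		return;
	/* Overflow: merge the two neighbours separated by the smallest gap. */
	for (i = 1; i + 1 < fr->nr; i++)
		if (a[i + 1].start - a[i].end < a[best + 1].start - a[best].end)
			best = i;
	a[best].end = a[best + 1].end;
	memmove(&a[best + 1], &a[best + 2], (fr->nr - best - 2) * sizeof(*a));
	fr->nr--;
}
```

Because the pair with the smallest gap is merged first, the unmodified
blocks that get pulled into a range are the fewest possible for that step.
With the bound set to 1 the same logic reduces to the old single [min, max]
span.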

Measured on a sparse random-write workload (1 GiB span, R disjoint dirty
regions per fsync, 300 fsyncs, bare-metal NVMe):

- fast-commit blocks per commit (R=16): 18.6 -> 1.0
- full-commit fallback rate (R=16): 22% -> 2% (on a small fs)
- mean fsync latency: R=16 -10%, R=64 -14%
- p99 fsync latency: R=16 -31%

The p99 improvement comes from eliminating the full-commit fallback spikes.

Testing: crash recovery (power loss -> fast-commit replay -> verify every
fsync'd block, then e2fsck) is clean; the ext4/generic fast-commit xfstests
show no regression; the unchanged on-disk format means e2fsprogs needs no
update. Both patches are checkpatch --strict clean.

Based on v6.17-rc3.

Daejun Park (2):
  ext4: track multiple disjoint fast-commit ranges per inode
  ext4: allocate the fast-commit range array lazily

 fs/ext4/ext4.h        |  40 +++++++--
 fs/ext4/fast_commit.c | 196 +++++++++++++++++++++++++++++++++++-------
 fs/ext4/super.c       |   1 +
 3 files changed, 199 insertions(+), 38 deletions(-)

--
2.43.0