Re: [PATCH] ext4: add optional rotating block allocation policy
From: Mario Lohajner
Date: Sun Feb 22 2026 - 18:39:03 EST
On 2/8/26 20:58, Theodore Tso wrote:
> On Sun, Feb 08, 2026 at 12:47:12PM +0100, Mario Lohajner wrote:
> > Someone comes forward with a fork, saying:
> > “Here is 'my fork'. I believe it may work well for 'some dishes'.”
> Give me *proof* that it works on 'some dishes' in terms of actual
> performance, specifying real-world workloads, and real-world devices,
> and we can talk. "I believe" is not enough for code that upstream has
> to test and maintain indefinitely. If it works for you, it's open
> source. You can run with an out-of-tree on your systems. But if you
> want us to accept it upstream, you need to provide something more than
> "I believe".
> Cheers,
> - Ted
Hi Ted,
Sorry for the late and lengthy reply.
Of course — no disagreement there. Getting code upstream is serious
business, and I fully agree that “I believe” is not sufficient.
If something is to be merged and maintained long-term, it must be
justified with measurable data on real workloads and real hardware.
Before going further, let me briefly say thank you:
Andreas — thank you for immediately recognizing the core idea and
potential usefulness of the round-robin policy.
Baokun — thank you for suggesting the per-CPU split cursors and
reinforcing the stream allocation direction.
Theodore — thank you for pushing me to rethink both the execution and
the structure of the patch itself.
That feedback directly shaped V2.
*** What's changed in V2 ***
Stream allocation enforcement:
To promote sequentiality, all files are treated as streams, but instead
of relying on a hash array of global goals (with unpredictable scaling),
each inode maintains its own atomic cursor.
This helps preserve intra-file locality and reduces fragmentation by
keeping allocations sequential per inode.
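
As a rough userspace illustration (not code from the patch), one way a
per-inode atomic cursor could hand out sequential ranges is sketched
below. The names struct inode_stream and stream_claim() are hypothetical,
and the wrap-around is simplified: a claimed range that straddles the end
of the device is not split here.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for per-inode allocator state; not the patch's struct. */
struct inode_stream {
	_Atomic uint64_t cursor;	/* next block offset this inode will claim */
};

/*
 * Claim `len` blocks for this inode and return the first block of the range.
 * A single atomic fetch-add gives every concurrent caller a disjoint,
 * consecutive range, so no lock is needed around the cursor itself.
 * The result is reduced modulo `total_blocks` so the stream wraps to the
 * start of the device (assumption: ranges crossing the end are not split).
 */
static uint64_t stream_claim(struct inode_stream *s, uint64_t len,
			     uint64_t total_blocks)
{
	assert(len > 0 && total_blocks > 0);

	uint64_t start = atomic_fetch_add_explicit(&s->cursor, len,
						   memory_order_relaxed);
	return start % total_blocks;
}
```

With the cursor initialised to 98 on a 100-block device, successive claims
of 4 blocks start at 98 and then at 2, i.e. the stream stays sequential and
wraps around rather than returning to a fixed goal.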
Per-CPU cursors:
It turns out the contention point isn’t the cursor itself, but rather
all CPUs racing for the same blocks.
Per-CPU cursors are the way to a solution, but with a small twist:
allocation starting points are split per CPU and evenly distributed
across the LBA space. In effect, this creates per-CPU zones that advance
along the LBA space as allocation progresses. This further helps avoid
allocation hotspots while keeping races and contention in check *without*
mutexes or other locks.
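
The even split of starting points can be illustrated with a small
userspace sketch. This is only an assumption about the arithmetic, not the
patch's code; zone_start() and cursor_advance() are hypothetical helpers,
and the remainder of an uneven division is simply left to the last zone.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Starting block for `cpu` when the device is divided into `nr_cpus`
 * equally sized zones. Each CPU begins allocating in its own zone, so
 * CPUs do not initially race for the same blocks.
 */
static uint64_t zone_start(uint64_t total_blocks, unsigned int nr_cpus,
			   unsigned int cpu)
{
	assert(nr_cpus > 0 && cpu < nr_cpus);
	return (total_blocks / nr_cpus) * cpu;
}

/*
 * Advance a CPU-local cursor by `len` blocks, wrapping to the start of
 * the device at the end. Because every cursor moves in the same
 * direction, the zones travel along the LBA space together.
 */
static uint64_t cursor_advance(uint64_t cursor, uint64_t len,
			       uint64_t total_blocks)
{
	assert(total_blocks > 0);
	return (cursor + len) % total_blocks;
}
```

For a 1000-block device and 4 CPUs, the starting points are 0, 250, 500
and 750; a cursor at block 990 that advances by 20 wraps to block 10.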
Allocator isolation is preserved:
The regular allocator is not modified or burdened in any way.
The rotating allocator is selected at mount time (-o rralloc) and
implemented as a separate allocation path.
This keeps the default behavior untouched and allows independent
evolution of "rralloc" policy.
The repository below summarizes the motive and result of the current
design:
https://github.com/mlohajner/RRALLOC
In summary, files are "floating" across the LBA space and in-place
overwriting is greatly reduced. It may look counterintuitive, but that is
the goal. Tested with v6.18.9 stable.
Best regards,
manjo
P.S.
On performance evidence
Like V1, V2 is a policy, meaning:
It relies on established and well-tested ext4 heuristics, adjusted to
produce round-robin allocation behavior.
As proof of principle, it performs round-robin allocation across the LBA
space (and wraps around to the start, thus delivering on the promise).
By design - leveraging existing ext4 heuristics and keeping the rotating
allocator fully separated - performance should remain in line with
current allocator expectations.
Preliminary results confirm that round-robin allocation can be
implemented without compromising (regular) allocator efficiency.