[RFC v3 00/11] ext4: Add extsize and forcealign support (groundwork for multi block atomic writes)

From: Ojaswin Mujoo
Date: Mon Mar 24 2025 - 03:37:42 EST


These patches lay the groundwork for supporting multi-block
HW-accelerated atomic writes without the use of bigalloc. Multi-block
atomic write support with bigalloc has already been posted as an RFC
here [3]. Without bigalloc, we need a mechanism to get aligned blocks
from the allocator so that HW-accelerated atomic writes can be
performed. extsize + forcealign provide this mechanism in ext4.

[3] https://lore.kernel.org/linux-ext4/cover.1742699765.git.ritesh.list@xxxxxxxxx/

- extsize is a per-inode hint to physically and logically align blocks
to a certain value.

- forcealign gives a **strict guarantee** that the allocator will
physically as well as logically align blocks to the extsize value.
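
To make the alignment condition concrete, here is a minimal userspace
sketch. It is not code from the series. It assumes extsize is expressed
in filesystem blocks and is a power of 2, and it additionally assumes
the extent length is a multiple of extsize. The struct and function
names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified view of one mapped extent (units: fs blocks). */
struct extent_sketch {
	uint64_t lblk;	/* first logical block in the file */
	uint64_t pblk;	/* first physical block on disk */
	uint64_t len;	/* length in blocks */
};

/*
 * True if the extent starts on an extsize boundary both logically and
 * physically, and spans a whole number of extsize units.
 * ASSUMPTION: extsize_blks is a non-zero power of 2.
 */
static bool extent_is_extsize_aligned(const struct extent_sketch *ex,
				      uint64_t extsize_blks)
{
	uint64_t mask = extsize_blks - 1;

	return ex->len != 0 &&
	       (ex->lblk & mask) == 0 &&
	       (ex->pblk & mask) == 0 &&
	       (ex->len & mask) == 0;
}
```

With extsize as a hint, an extent failing this check is still
acceptable; with forcealign, the allocation would have to fail instead.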

The extsize support is almost the same as v2, rebased onto the latest
ext4 dev branch. Patches 7 - 11 add the new forcealign feature, which
can be seen as a sort of per-file bigalloc. Some points about forcealign:

* Allocation on a forcealign inode is guaranteed to get an extent
aligned to extsize physically and logically, else an error is returned.
This mimics bigalloc but on a per-file level.

* Deallocations are also only allowed in extsize aligned units. This is
pretty strict and can be relaxed in later revisions.

* FS_XFLAG_FORCEALIGN can be set via the FS_IOC_FS(GET/SET)XATTR ioctls
to enable forcealign. As of now, we can only enable forcealign if
extsize is set on the inode.

* Reused the EXT4_EOFBLOCKS_FL flag for forcealign since it is no longer
used. In case this is not feasible, we can explore other ways to set
the flag (e.g. xattr or overriding a field).

Some of the TODOs and open questions regarding the design:

1. I want to design forcealign in such a way that FS formatting is not
required. For that I'm exploring 2 options:

- Add an RO_COMPAT feature flag. tune2fs can be used to enable it on
existing filesystems without formatting. This is the simplest option,
but it has the drawback that even for a single forcealign file, the FS
would become RO on older kernels.

- To avoid that, we can instead expose an ioctl to fix a misaligned
forcealign file. However, this is an overhead for sysadmins/end
users. Maybe fsck can help with this?

2. For extsize, I'm not planning to support an FS-wide tunable since we
already have bigalloc for that.

3. Also, we are not supporting non-power-of-2 extsizes (at least for
now) as there are no clear use cases to justify the added complexity.

4. Directory-wide extsize is not yet supported; however, it can be added
in a future revision.
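
For illustration only, regarding point 3: with a power-of-2 extsize,
validating the value and rounding a block range out to extsize
boundaries reduce to simple mask operations, as in the sketch below. A
non-power-of-2 extsize would need division and modulo instead. This is
a reading of where some of the complexity could come from, not something
the series states, and the function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if n is a non-zero power of 2. */
static bool is_pow2(uint64_t n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/*
 * Expand the block range [lblk, lblk + len) outward to extsize
 * boundaries. ASSUMPTION: extsize is a non-zero power of 2 and the
 * additions do not overflow.
 */
static void round_to_extsize(uint64_t lblk, uint64_t len, uint64_t extsize,
			     uint64_t *out_lblk, uint64_t *out_len)
{
	uint64_t mask = extsize - 1;
	uint64_t start = lblk & ~mask;			/* round down */
	uint64_t end = (lblk + len + mask) & ~mask;	/* round up */

	*out_lblk = start;
	*out_len = end - start;
}
```

For example, with extsize = 4 blocks, a request for blocks 5..7
(lblk = 5, len = 3) expands to blocks 4..7 (lblk = 4, len = 4).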

We are passing quick xfstests with these patches along with a lot of
custom allocation scenarios that I'll eventually add to xfstests;
however, this series is still largely an RFC and might have bugs.

Posting this here for review and suggestions on the design as well as
implementation.


** Changes since rfc v2 [2] **

- Patches 0-6 are the same as v2, just rebased. Patches 7-11 are new in
this series.
- Patch 7 adds a wrapper on ext4_map_blocks to better handle some
allocation scenarios.
- Patches 8-11 add a new feature called forcealign. More about it above.

[2] https://lore.kernel.org/linux-ext4/cover.1733901374.git.ojaswin@xxxxxxxxxxxxx/

** Changes since rfc v1 [1] **

1. Allocations beyond EOF also respect the extsize hint; however,
unlike XFS, we don't trim the blocks allocated beyond EOF due
to extsize hints. The reasoning behind this is explained in
patch 6/6.

2. Minor fixes in extsize ioctl handling logic.

The rest of the design details can be found in the individual patches
as well as the original cover letter, which can be found here:

[1]
https://lore.kernel.org/linux-ext4/cover.1726034272.git.ojaswin@xxxxxxxxxxxxx/

Comments and suggestions are welcome!

Regards,
ojaswin

Ojaswin Mujoo (11):
ext4: add aligned allocation hint in mballoc
ext4: allow inode preallocation for aligned alloc
ext4: support for extsize hint using FS_IOC_FS(GET/SET)XATTR
ext4: pass lblk and len explicitly to ext4_split_extent*()
ext4: add extsize hint support
ext4: make extsize work with EOF allocations
ext4: add ext4_map_blocks_extsize() wrapper to handle overwrites
ext4: add forcealign support of mballoc
ext4: add forcealign support to ext4_map_blocks
ext4: add support for adding focealign via SETXATTR ioctl
ext4: disallow unaligned deallocations on forcealign inodes

fs/ext4/ext4.h | 20 +-
fs/ext4/ext4_jbd2.h | 23 ++
fs/ext4/extents.c | 294 ++++++++++++++++---
fs/ext4/inode.c | 543 +++++++++++++++++++++++++++++++++---
fs/ext4/ioctl.c | 191 +++++++++++++
fs/ext4/mballoc.c | 141 ++++++++--
fs/ext4/super.c | 1 +
include/trace/events/ext4.h | 3 +
include/uapi/linux/fs.h | 6 +-
9 files changed, 1111 insertions(+), 111 deletions(-)

--
2.48.1