[RFC v2 6/6] ext4: make extsize work with EOF allocations

From: Ojaswin Mujoo
Date: Wed Dec 11 2024 - 02:59:44 EST


Make extsize hints work with EOF allocations. Unlike XFS, we do not
truncate blocks that are left past EOF. There are two main reasons:

1. Since the user is opting for extsize allocations, chances are
that they will use the blocks in the future.

2. If we start truncating all EOF blocks in ext4_release_file like
XFS, then we would always truncate them, even blocks that were
intentionally preallocated using fallocate w/ KEEP_SIZE, which
might confuse users. This is because ext4 has no way to tell
whether blocks beyond EOF were allocated intentionally. We could
work around this with an on-disk inode flag like XFS
(XFS_DIFLAG_PREALLOC), but that would be overkill. It's much
simpler to just let the EOF blocks stick around.
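
For reference, intentional preallocation past EOF from user space can
look roughly like the sketch below. It is only an illustration: the
helper name and the /tmp path are made up, and whether st_blocks grows
depends on the filesystem backing /tmp. The point is that the file size
stays unchanged while blocks may be allocated beyond it.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of this patch): preallocate @len bytes
 * past EOF on a fresh temp file with FALLOC_FL_KEEP_SIZE and report the
 * resulting st_size and st_blocks. Returns 0 or a negative errno.
 */
static int prealloc_keep_size_demo(off_t len, off_t *size_after,
				   long long *blocks_after)
{
	char path[] = "/tmp/keep-size-demo-XXXXXX"; /* assumed writable */
	struct stat st;
	int fd, ret = 0;

	assert(len > 0);

	fd = mkstemp(path);
	if (fd < 0)
		return -errno;

	/* KEEP_SIZE: allocate blocks without moving i_size. */
	if (fallocate(fd, FALLOC_FL_KEEP_SIZE, 0, len) != 0 ||
	    fstat(fd, &st) != 0) {
		ret = -errno;
	} else {
		*size_after = st.st_size;
		*blocks_after = (long long)st.st_blocks;
	}

	unlink(path);
	close(fd);
	return ret;
}
```

On success, size_after is still 0 for the new file even though blocks
may now exist past EOF. Nothing on disk records that those blocks were
requested explicitly, which is the ambiguity described above.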

NOTE:
One thing that changes in this patch is that for direct I/O with
extsize hints we need to pass EXT4_GET_BLOCKS_IO_CREATE_EXT even when
allocating beyond i_size.
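
The resulting choice of flags can be modelled in user space roughly as
below. This is a simplified sketch, not the kernel code: the enum and
function names are hypothetical, and only the two branches touched by
this patch are modelled. Anything else in the real if/else chain is
lumped into DIO_OTHER.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins; only the comments refer to real ext4 flags. */
enum dio_alloc_mode {
	DIO_CREATE,		/* models EXT4_GET_BLOCKS_CREATE */
	DIO_IO_CREATE_EXT,	/* models EXT4_GET_BLOCKS_IO_CREATE_EXT */
	DIO_OTHER,		/* branches not shown in this patch */
};

/*
 * Model of the post-patch decision: a write starting at or beyond
 * i_size takes the plain CREATE path only when extsize hints are not
 * in use. With extsize, the unwritten-extent path is used instead.
 */
static enum dio_alloc_mode pick_dio_mode(int64_t write_start, int64_t i_size,
					 bool use_extsize, bool has_extents)
{
	assert(write_start >= 0 && i_size >= 0);

	if (write_start >= i_size && !use_extsize)
		return DIO_CREATE;
	if (has_extents)
		return DIO_IO_CREATE_EXT;
	return DIO_OTHER;
}
```

For example, a write at offset 8192 into a 4096 byte extent-mapped file
yields DIO_CREATE without extsize and DIO_IO_CREATE_EXT with it.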

Signed-off-by: Ojaswin Mujoo <ojaswin@xxxxxxxxxxxxx>
---
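To make the effect of dropping the EOF check easier to follow, here is
a small user-space model of the length computation visible in the
ext4_map_blocks hunk, plus the check this patch removes. It is a sketch
under assumptions: the struct and function names are hypothetical, and
rounding the start block down by `align` is assumed, since that line is
not part of the diff context.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct blk_range {
	uint32_t lblk;	/* first logical block */
	uint32_t len;	/* length in blocks */
};

/* Smallest power of two >= v (user-space stand-in for roundup_pow_of_two). */
static uint32_t roundup_pow2(uint32_t v)
{
	uint32_t p = 1;

	assert(v <= (1u << 31));
	while (p < v)
		p <<= 1;
	return p;
}

/* Mirrors: align = m_lblk % extsize; len = m_len + align; max(pow2, extsize). */
static struct blk_range extsize_align(uint32_t lblk, uint32_t len,
				      uint32_t extsize)
{
	struct blk_range r;
	uint32_t align, total, pow2;

	assert(extsize > 0);
	align = lblk % extsize;
	total = len + align;
	pow2 = roundup_pow2(total);

	r.lblk = lblk - align;	/* assumption: start is rounded down */
	r.len = pow2 > extsize ? pow2 : extsize;
	return r;
}

/* The removed check: does the aligned range end past i_size? */
static bool crosses_eof(struct blk_range r, unsigned int blkbits,
			int64_t i_size)
{
	int64_t end = ((int64_t)r.lblk + r.len) << blkbits;

	return end > i_size;
}
```

With 4k blocks and extsize of 16 blocks, a 3 block write at block 5
becomes the range [0, 16). Before this patch, such a request fell back
to a normal allocation whenever i_size was below 64k; now it keeps the
extsize path.
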
fs/ext4/inode.c | 22 ++++++----------------
1 file changed, 6 insertions(+), 16 deletions(-)

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index d511282ebdcc..d292e39a050a 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -756,7 +756,6 @@ int ext4_map_blocks(handle_t *handle, struct inode *inode,
* ext4_extents.h here?
*/
int max_unwrit_len = ((1UL << 15) - 1);
- loff_t end;

align = orig_map->m_lblk % extsize;
len = orig_map->m_len + align;
@@ -765,18 +764,6 @@ int ext4_map_blocks(handle_t *handle, struct inode *inode,
extsize_map.m_len =
max_t(unsigned int, roundup_pow_of_two(len), extsize);

- /*
- * For now allocations beyond EOF don't use extsize hints so
- * that we can avoid dealing with extra blocks allocated past
- * EOF. We have inode lock since extsize allocations are
- * non-delalloc so i_size can be accessed safely
- */
- end = (extsize_map.m_lblk + (loff_t)extsize_map.m_len) << inode->i_blkbits;
- if (end > inode->i_size) {
- flags = orig_flags & ~EXT4_GET_BLOCKS_EXTSIZE;
- goto set_map;
- }
-
/* Fallback to normal allocation if we go beyond max len */
if (extsize_map.m_len >= max_unwrit_len) {
flags = orig_flags & ~EXT4_GET_BLOCKS_EXTSIZE;
@@ -3641,10 +3628,13 @@ static int ext4_iomap_alloc(struct inode *inode, struct ext4_map_blocks *map,
* i_disksize out to i_size. This could be beyond where direct I/O is
* happening and thus expose allocated blocks to direct I/O reads.
*
- * NOTE for extsize hints: We only support it for writes inside
- * EOF (for now) to not have to deal with blocks past EOF
+ * NOTE: For extsize hint based EOF allocations, we still need
+ * IO_CREATE_EXT flag because we will be allocating more than the write
+ * hence the extra blocks need to be marked unwritten and split before
+ * the I/O.
*/
- else if (((loff_t)map->m_lblk << blkbits) >= i_size_read(inode))
+ else if (((loff_t)map->m_lblk << blkbits) >= i_size_read(inode) &&
+ !ext4_should_use_extsize(inode))
m_flags = EXT4_GET_BLOCKS_CREATE;
else if (ext4_test_inode_flag(inode, EXT4_INODE_EXTENTS)) {
m_flags = EXT4_GET_BLOCKS_IO_CREATE_EXT;
--
2.43.5