[PATCH v2] ntfs: avoid stale runlist element dereference in fallocate

From: Cen Zhang

Date: Sat Jun 27 2026 - 00:28:18 EST


ntfs_attr_fallocate() allocates holes and delayed allocations inside
initialized size by looking up the current runlist element under
ni->runlist.lock. The returned struct runlist_element is only a borrowed
pointer into ni->runlist.rl. A writer can replace and free that array
after the read lock is dropped, so later reads of rl->lcn, rl->length and
rl->vcn can touch freed memory.

The buggy scenario involves two paths, with each column showing the order
within that path:

ntfs_attr_fallocate():
1. Take ni->runlist.lock for read.
2. Get rl from ntfs_attr_find_vcn_nolock().
3. Drop ni->runlist.lock.
4. Read rl->lcn, rl->length and rl->vcn.

mmap page_mkwrite:
1. Enter ntfs_filemap_page_mkwrite().
2. Reach __ntfs_write_iomap_begin() and ntfs_attr_map_cluster().
3. Merge allocation state with ntfs_runlists_merge().
4. Reallocate ni->runlist.rl in ntfs_rl_realloc(), freeing the old array.

Validation reproduced this kernel report:
BUG: KASAN: slab-use-after-free in ntfs_attr_fallocate+0xbb8/0xd00

Call Trace:
<TASK>
dump_stack_lvl+0x66/0xa0
print_report+0xce/0x630
? ntfs_attr_fallocate+0xbb8/0xd00
? srso_alias_return_thunk+0x5/0xfbef5
? __virt_addr_valid+0x20d/0x410
? ntfs_attr_fallocate+0xbb8/0xd00
kasan_report+0xe0/0x110
? ntfs_attr_fallocate+0xbb8/0xd00
ntfs_attr_fallocate+0xbb8/0xd00
? lock_acquire+0x2b8/0x2f0
? __pfx_ntfs_attr_fallocate+0x10/0x10
? 0xffffffffc0000095
? down_write+0x10d/0x1e0
ntfs_fallocate+0x5c9/0x1d00
? __pfx_ntfs_fallocate+0x10/0x10
? srso_alias_return_thunk+0x5/0xfbef5
? lock_acquire+0x2b8/0x2f0
? srso_alias_return_thunk+0x5/0xfbef5
? selinux_file_permission+0x3a7/0x510
vfs_fallocate+0x29d/0xd30
__x64_sys_fallocate+0xc7/0x150
? do_syscall_64+0x81/0x6a0
do_syscall_64+0x115/0x6a0
entry_SYSCALL_64_after_hwframe+0x77/0x7f

Allocated by task 410:
kasan_save_stack+0x33/0x60
kasan_save_track+0x14/0x30
__kasan_kmalloc+0xaa/0xb0
__kvmalloc_node_noprof+0x353/0x920
ntfs_rl_realloc+0x3f/0x110
ntfs_runlists_merge+0xaa3/0x3010
ntfs_attr_map_cluster+0x4e5/0xf80
ntfs_attr_fallocate+0x53f/0xd00
ntfs_fallocate+0x5c9/0x1d00
vfs_fallocate+0x29d/0xd30
__x64_sys_fallocate+0xc7/0x150
do_syscall_64+0x115/0x6a0
entry_SYSCALL_64_after_hwframe+0x77/0x7f

Freed by task 424:
kasan_save_stack+0x33/0x60
kasan_save_track+0x14/0x30
kasan_save_free_info+0x3b/0x60
__kasan_slab_free+0x5f/0x80
kfree+0x307/0x580
ntfs_rl_realloc+0x6f/0x110
ntfs_runlists_merge+0x7b1/0x3010
ntfs_attr_map_cluster+0x4e5/0xf80
__ntfs_write_iomap_begin+0x8cd/0x2280
iomap_iter+0x6de/0x11e0
iomap_page_mkwrite+0x391/0x650
ntfs_filemap_page_mkwrite+0x1ac/0x400
do_page_mkwrite+0x15c/0x280
__handle_mm_fault+0xd6d/0x1ca0
handle_mm_fault+0x19c/0x470
do_user_addr_fault+0x23b/0x9c0
exc_page_fault+0x5c/0xc0
asm_exc_page_fault+0x26/0x30

Fix this by copying the needed runlist fields while the read lock is still
held and using only those scalar snapshots after unlocking.

After the snapshot, ntfs_attr_map_cluster() can also find that the range
is already mapped and return balloc=false. Only call ntfs_dio_zero_range()
when new clusters were allocated, matching the write iomap path and
preserving the zero-newly-allocated-holes behavior.

Reviewed-by: Hyunchul Lee <hyc.lee@xxxxxxxxx>
Fixes: 495e90fa3348 ("ntfs: update attrib operations")
Assisted-by: Codex:gpt-5.5
Signed-off-by: Cen Zhang <zzzccc427@xxxxxxxxx>
---
v2:
Skip ntfs_dio_zero_range() when ntfs_attr_map_cluster() reports that no
clusters were allocated, as suggested by Hyunchul Lee.

fs/ntfs/attrib.c | 30 +++++++++++++++++++-----------
1 file changed, 19 insertions(+), 11 deletions(-)

diff --git a/fs/ntfs/attrib.c b/fs/ntfs/attrib.c
index dd8828098511..a99b84751eb1 100644
--- a/fs/ntfs/attrib.c
+++ b/fs/ntfs/attrib.c
@@ -5536,6 +5536,7 @@ int ntfs_attr_fallocate(struct ntfs_inode *ni, loff_t start, loff_t byte_len, bo
s64 old_data_size;
s64 vcn_start, vcn_end, vcn_uninit, vcn, try_alloc_cnt;
s64 lcn, alloc_cnt;
+ s64 rl_lcn, rl_length, rl_vcn;
int err = 0;
struct runlist_element *rl;
bool balloc;
@@ -5615,19 +5616,23 @@ int ntfs_attr_fallocate(struct ntfs_inode *ni, loff_t start, loff_t byte_len, bo
while (vcn < vcn_uninit) {
down_read(&ni->runlist.lock);
rl = ntfs_attr_find_vcn_nolock(ni, vcn, NULL);
- up_read(&ni->runlist.lock);
if (IS_ERR(rl)) {
+ up_read(&ni->runlist.lock);
err = PTR_ERR(rl);
goto out;
}
+ rl_lcn = rl->lcn;
+ rl_length = rl->length;
+ rl_vcn = rl->vcn;
+ up_read(&ni->runlist.lock);

- if (rl->lcn > 0) {
- vcn += rl->length - (vcn - rl->vcn);
- } else if (rl->lcn == LCN_DELALLOC || rl->lcn == LCN_HOLE) {
- try_alloc_cnt = min(rl->length - (vcn - rl->vcn),
+ if (rl_lcn > 0) {
+ vcn += rl_length - (vcn - rl_vcn);
+ } else if (rl_lcn == LCN_DELALLOC || rl_lcn == LCN_HOLE) {
+ try_alloc_cnt = min(rl_length - (vcn - rl_vcn),
vcn_uninit - vcn);

- if (rl->lcn == LCN_DELALLOC) {
+ if (rl_lcn == LCN_DELALLOC) {
vcn += try_alloc_cnt;
continue;
}
@@ -5642,11 +5647,14 @@ int ntfs_attr_fallocate(struct ntfs_inode *ni, loff_t start, loff_t byte_len, bo
if (err)
goto out;

- err = ntfs_dio_zero_range(VFS_I(ni),
- lcn << vol->cluster_size_bits,
- alloc_cnt << vol->cluster_size_bits);
- if (err > 0)
- goto out;
+ if (balloc) {
+ err = ntfs_dio_zero_range(VFS_I(ni),
+ lcn << vol->cluster_size_bits,
+ alloc_cnt <<
+ vol->cluster_size_bits);
+ if (err > 0)
+ goto out;
+ }

if (signal_pending(current))
goto out;
--
2.43.0