Re: [PATCH] ntfs: avoid stale runlist element dereference in fallocate

From: Hyunchul Lee

Date: Fri Jun 26 2026 - 03:59:02 EST


2026년 6월 25일 (목) 오후 5:34, Cen Zhang <zzzccc427@xxxxxxxxx>님이 작성:
>
> ntfs_attr_fallocate() walks the data runlist when it allocates holes or
> delayed allocations inside initialized size. It looks up a struct
> runlist_element under ni->runlist.lock, drops the lock, and then reads
> rl->lcn, rl->length and rl->vcn to decide how far the current extent can
> advance.
>
> That pointer is only borrowed from ni->runlist.rl. A racing mmap
> page_mkwrite path can enter __ntfs_write_iomap_begin(), map clusters, and
> replace the same runlist through ntfs_runlists_merge(). When
> ntfs_rl_realloc() allocates replacement storage, the old runlist array is
> freed while fallocate still carries the stale element pointer.
>
> The buggy scenario involves two paths, with each column showing the order
> within that path:
>
> ntfs_attr_fallocate(): mmap page_mkwrite path:
> 1. Look up rl under 1. Enter __ntfs_write_iomap_begin().
> ni->runlist.lock. 2. Map clusters through
> 2. Drop ni->runlist.lock. ntfs_attr_map_cluster().
> 3. Read rl->lcn, rl->length 3. Merge a replacement runlist.
> and rl->vcn. 4. Free the old runlist array.
>
> Snapshot the needed runlist fields while ni->runlist.lock is still held,
> and use only those scalars after unlock. This preserves the existing
> fallocate decisions without extending the lock over the later allocation
> and zeroing work.
>
> Validation reproduced this kernel report:
> BUG: KASAN: slab-use-after-free in ntfs_attr_fallocate+0xbb8/0xd00
>
> Call Trace:
> <TASK>
> dump_stack_lvl+0x66/0xa0
> print_report+0xce/0x630
> ? ntfs_attr_fallocate+0xbb8/0xd00
> ? srso_alias_return_thunk+0x5/0xfbef5
> ? __virt_addr_valid+0x20d/0x410
> ? ntfs_attr_fallocate+0xbb8/0xd00
> kasan_report+0xe0/0x110
> ? ntfs_attr_fallocate+0xbb8/0xd00
> ntfs_attr_fallocate+0xbb8/0xd00
> ? lock_acquire+0x2b8/0x2f0
> ? __pfx_ntfs_attr_fallocate+0x10/0x10
> ? 0xffffffffc0000095
> ? down_write+0x10d/0x1e0
> ntfs_fallocate+0x5c9/0x1d00
> ? __pfx_ntfs_fallocate+0x10/0x10
> ? srso_alias_return_thunk+0x5/0xfbef5
> ? lock_acquire+0x2b8/0x2f0
> ? srso_alias_return_thunk+0x5/0xfbef5
> ? selinux_file_permission+0x3a7/0x510
> vfs_fallocate+0x29d/0xd30
> __x64_sys_fallocate+0xc7/0x150
> ? do_syscall_64+0x81/0x6a0
> do_syscall_64+0x115/0x6a0
> entry_SYSCALL_64_after_hwframe+0x77/0x7f
>
> Allocated by task 410:
> kasan_save_stack+0x33/0x60
> kasan_save_track+0x14/0x30
> __kasan_kmalloc+0xaa/0xb0
> __kvmalloc_node_noprof+0x353/0x920
> ntfs_rl_realloc+0x3f/0x110
> ntfs_runlists_merge+0xaa3/0x3010
> ntfs_attr_map_cluster+0x4e5/0xf80
> ntfs_attr_fallocate+0x53f/0xd00
> ntfs_fallocate+0x5c9/0x1d00
> vfs_fallocate+0x29d/0xd30
> __x64_sys_fallocate+0xc7/0x150
> do_syscall_64+0x115/0x6a0
> entry_SYSCALL_64_after_hwframe+0x77/0x7f
>
> Freed by task 424:
> kasan_save_stack+0x33/0x60
> kasan_save_track+0x14/0x30
> kasan_save_free_info+0x3b/0x60
> __kasan_slab_free+0x5f/0x80
> kfree+0x307/0x580
> ntfs_rl_realloc+0x6f/0x110
> ntfs_runlists_merge+0x7b1/0x3010
> ntfs_attr_map_cluster+0x4e5/0xf80
> __ntfs_write_iomap_begin+0x8cd/0x2280
> iomap_iter+0x6de/0x11e0
> iomap_page_mkwrite+0x391/0x650
> ntfs_filemap_page_mkwrite+0x1ac/0x400
> do_page_mkwrite+0x15c/0x280
> __handle_mm_fault+0xd6d/0x1ca0
> handle_mm_fault+0x19c/0x470
> do_user_addr_fault+0x23b/0x9c0
> exc_page_fault+0x5c/0xc0
> asm_exc_page_fault+0x26/0x30
>
> Fixes: 495e90fa3348 ("ntfs: update attrib operations")
> Assisted-by: Codex:gpt-5.5
> Signed-off-by: Cen Zhang <zzzccc427@xxxxxxxxx>

Looks good to me.

Reviewed-by: Hyunchul Lee <hyc.lee@xxxxxxxxx>

Additionally, we need to skip ntfs_dio_zero_range() if
clusters are not allocated.

@@ -5647,9 +5647,10 @@ int ntfs_attr_fallocate(struct ntfs_inode *ni,
loff_t start, loff_t byte_len, bo
if (err)
goto out;

- err = ntfs_dio_zero_range(VFS_I(ni),
- lcn <<
vol->cluster_size_bits,
- alloc_cnt <<
vol->cluster_size_bits);
+ if (balloc)
+ err = ntfs_dio_zero_range(VFS_I(ni),
+ lcn
<< vol->cluster_size_bits,
+
alloc_cnt << vol->cluster_size_bits);


> ---
> fs/ntfs/attrib.c | 17 +++++++++++------
> 1 file changed, 11 insertions(+), 6 deletions(-)
>
> diff --git a/fs/ntfs/attrib.c b/fs/ntfs/attrib.c
> index dd8828098511..73e01f32c067 100644
> --- a/fs/ntfs/attrib.c
> +++ b/fs/ntfs/attrib.c
> @@ -5536,6 +5536,7 @@ int ntfs_attr_fallocate(struct ntfs_inode *ni, loff_t start, loff_t byte_len, bo
> s64 old_data_size;
> s64 vcn_start, vcn_end, vcn_uninit, vcn, try_alloc_cnt;
> s64 lcn, alloc_cnt;
> + s64 rl_lcn, rl_length, rl_vcn;
> int err = 0;
> struct runlist_element *rl;
> bool balloc;
> @@ -5615,19 +5616,23 @@ int ntfs_attr_fallocate(struct ntfs_inode *ni, loff_t start, loff_t byte_len, bo
> while (vcn < vcn_uninit) {
> down_read(&ni->runlist.lock);
> rl = ntfs_attr_find_vcn_nolock(ni, vcn, NULL);
> - up_read(&ni->runlist.lock);
> if (IS_ERR(rl)) {
> + up_read(&ni->runlist.lock);
> err = PTR_ERR(rl);
> goto out;
> }
> + rl_lcn = rl->lcn;
> + rl_length = rl->length;
> + rl_vcn = rl->vcn;
> + up_read(&ni->runlist.lock);
>
> - if (rl->lcn > 0) {
> - vcn += rl->length - (vcn - rl->vcn);
> - } else if (rl->lcn == LCN_DELALLOC || rl->lcn == LCN_HOLE) {
> - try_alloc_cnt = min(rl->length - (vcn - rl->vcn),
> + if (rl_lcn > 0) {
> + vcn += rl_length - (vcn - rl_vcn);
> + } else if (rl_lcn == LCN_DELALLOC || rl_lcn == LCN_HOLE) {
> + try_alloc_cnt = min(rl_length - (vcn - rl_vcn),
> vcn_uninit - vcn);
>
> - if (rl->lcn == LCN_DELALLOC) {
> + if (rl_lcn == LCN_DELALLOC) {
> vcn += try_alloc_cnt;
> continue;
> }
> --
> 2.43.0
>


--
Thanks,
Hyunchul