BUG: KCSAN: data-race in do_mremap / vma_complete

From: Jianzhou Zhao

Date: Wed Mar 11 2026 - 04:10:28 EST



Subject: [BUG] mm/mremap: KCSAN: data-race in do_mremap / vma_complete
Dear Maintainers,
We are writing to report a KCSAN-detected data race in the memory management subsystem, involving `vma_complete` and `check_mremap_params`. The bug was found by our custom fuzzing tool, RacePilot. The race occurs when `vma_complete` increments `mm->map_count` while `check_mremap_params` concurrently reads the same `current->mm->map_count` without holding `mmap_lock` and without a marked access such as `READ_ONCE()`. We observed it on Linux kernel version 6.18.0-08691-g2061f18ad76e-dirty.
Call Trace & Context
==================================================================
BUG: KCSAN: data-race in do_mremap / vma_complete
write to 0xffff88800c232348 of 4 bytes by task 27920 on cpu 1:
 vma_complete+0x6d2/0x8a0 home/kfuzz/linux/mm/vma.c:354
 __split_vma+0x5fb/0x6f0 home/kfuzz/linux/mm/vma.c:567
 vms_gather_munmap_vmas+0xe5/0x6a0 home/kfuzz/linux/mm/vma.c:1369
 do_vmi_align_munmap+0x2a3/0x450 home/kfuzz/linux/mm/vma.c:1538
 do_vmi_munmap+0x19c/0x2e0 home/kfuzz/linux/mm/vma.c:1596
 do_munmap+0x97/0xc0 home/kfuzz/linux/mm/mmap.c:1068
 mremap_to+0x179/0x240 home/kfuzz/linux/mm/mremap.c:1374
 ...
 __x64_sys_mremap+0x66/0x80 home/kfuzz/linux/mm/mremap.c:1961
read to 0xffff88800c232348 of 4 bytes by task 27919 on cpu 0:
 check_mremap_params home/kfuzz/linux/mm/mremap.c:1816 [inline]
 do_mremap+0x352/0x1090 home/kfuzz/linux/mm/mremap.c:1920
 __do_sys_mremap+0x129/0x160 home/kfuzz/linux/mm/mremap.c:1993
 __se_sys_mremap home/kfuzz/linux/mm/mremap.c:1961 [inline]
 __x64_sys_mremap+0x66/0x80 home/kfuzz/linux/mm/mremap.c:1961
 ...
value changed: 0x0000001f -> 0x00000020
Reported by Kernel Concurrency Sanitizer on:
CPU: 0 UID: 0 PID: 27919 Comm: syz.7.1375 Not tainted 6.18.0-08691-g2061f18ad76e-dirty #42 PREEMPT(voluntary)
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
==================================================================
Execution Flow & Code Context
In `mm/vma.c`, the `vma_complete()` function finalizes VMA alterations such as insertions. When a new VMA is successfully attached (e.g., during a split), the function increments the process's `map_count`. The caller holds `mmap_lock` in write mode at this point:
```c
// mm/vma.c
static void vma_complete(struct vma_prepare *vp, struct vma_iterator *vmi,
    struct mm_struct *mm)
{
 ...
 } else if (vp->insert) {
  /* ... split ... */
  vma_iter_store_new(vmi, vp->insert);
  mm->map_count++; // <-- Plain concurrent write
 }
 ...
}
```
Conversely, the `mremap` syscall path runs `check_mremap_params()` *before* acquiring `mmap_lock`. This lets malformed calls be rejected quickly but leaves the map count check unsynchronized:
```c
// mm/mremap.c
static unsigned long check_mremap_params(struct vma_remap_struct *vrm)
{
 ...
 /* Worst-scenario case ... */
 if ((current->mm->map_count + 2) >= sysctl_max_map_count - 3) // <-- Plain concurrent read
  return -ENOMEM;
 return 0;
}
```
At `mm/mremap.c:1924`, `mmap_write_lock_killable(mm)` is called only *after* `check_mremap_params()` returns successfully.
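To make the threshold arithmetic concrete, here is a minimal userspace sketch of the comparison quoted above. The helper names are hypothetical, and the limit of 65530 is an assumption (the usual default of `sysctl_max_map_count`); the system in the report may have been configured differently.

```c
#include <stdbool.h>

/* Assumption: default sysctl_max_map_count (USHRT_MAX - 5 = 65530). */
#define MODEL_DEFAULT_MAX_MAP_COUNT 65530

/* Hypothetical helper: the same comparison as the quoted check, on plain ints. */
static bool precheck_rejects(int map_count, int max_map_count)
{
	return (map_count + 2) >= max_map_count - 3;
}

/*
 * (map_count + 2) >= max - 3 is equivalent to map_count >= max - 5,
 * so this is the smallest count the pre-check rejects.
 */
static int first_rejected_count(int max_map_count)
{
	return max_map_count - 5;
}
```

Under that assumed limit, both values seen in the KCSAN report (0x1f before the write, 0x20 after) are far below the rejection point of 65525, so the pre-check would give the same answer whichever value the racing read observed.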
Root Cause Analysis
The data race arises because the `mremap` parameter validator performs an early heuristic rejection based on the current value of `mm->map_count`, and it does so without any lock (`mmap_lock` is taken later in `do_mremap`). The result is a plain, lockless read racing against concurrent threads that legitimately modify `mm->map_count`, such as `vma_complete` incrementing the count during a split while holding `mmap_lock`. Because the read is not marked with `READ_ONCE()` or similar, KCSAN reports it, and the compiler is in principle permitted to tear the load.
Unfortunately, we were unable to generate a reproducer for this bug.
Potential Impact
This data race could in principle make the outcome of the `mremap` pre-check nondeterministic. Because the read of the 4-byte `map_count` is unmarked, the compiler is theoretically free to tear or repeat the load, which could cause `check_mremap_params` to accept or reject a request inconsistently. In practice the race is virtually benign: the check is only a heuristic pre-check, and a stricter evaluation takes place later under the lock. Fixing it nonetheless silences the KCSAN report and documents the lockless access as intentional.
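The "heuristic pre-check, authoritative check under the lock" pattern can be sketched in userspace as follows. This is an analogy under stated assumptions, not kernel code: `struct mm_model`, `try_add_mapping()` and `run_model()` are hypothetical names, a pthread mutex stands in for `mmap_lock`, and C11 relaxed atomics stand in for marked accesses so that the model itself is free of undefined behavior.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the relevant part of mm_struct. */
struct mm_model {
	pthread_mutex_t lock;  /* plays the role of mmap_lock */
	atomic_int map_count;  /* only written with the lock held */
	int max_map_count;     /* plays the role of sysctl_max_map_count */
};

/* Phase 1: lockless heuristic; the snapshot may already be stale. */
static bool precheck_rejects(struct mm_model *mm)
{
	int snap = atomic_load_explicit(&mm->map_count, memory_order_relaxed);

	return (snap + 2) >= mm->max_map_count - 3;
}

/* Phase 2: authoritative check and update, both under the lock. */
static bool try_add_mapping(struct mm_model *mm)
{
	bool ok = false;
	int cur;

	if (precheck_rejects(mm))
		return false;

	pthread_mutex_lock(&mm->lock);
	cur = atomic_load_explicit(&mm->map_count, memory_order_relaxed);
	if (cur < mm->max_map_count) {
		atomic_store_explicit(&mm->map_count, cur + 1,
				      memory_order_relaxed);
		ok = true;
	}
	pthread_mutex_unlock(&mm->lock);
	return ok;
}

struct worker_arg {
	struct mm_model *mm;
	int attempts;
};

static void *worker(void *p)
{
	struct worker_arg *a = p;

	for (int i = 0; i < a->attempts; i++)
		try_add_mapping(a->mm);
	return NULL;
}

/* Runs up to 16 racing workers and returns the final map_count. */
static int run_model(int nthreads, int attempts, int max_map_count)
{
	struct mm_model mm = { .max_map_count = max_map_count };
	struct worker_arg arg = { &mm, attempts };
	pthread_t tids[16];
	int created = 0;
	int final_count;

	if (nthreads > 16)
		nthreads = 16;
	pthread_mutex_init(&mm.lock, NULL);
	atomic_init(&mm.map_count, 0);

	for (int i = 0; i < nthreads; i++) {
		if (pthread_create(&tids[created], NULL, worker, &arg) != 0)
			break;
		created++;
	}
	for (int i = 0; i < created; i++)
		pthread_join(tids[i], NULL);

	final_count = atomic_load(&mm.map_count);
	pthread_mutex_destroy(&mm.lock);
	return final_count;
}
```

In this model a stale snapshot in phase 1 can only let a request through to phase 2 or turn it away early; the limit itself is enforced by the check made with the lock held, so the counter never exceeds `max_map_count` regardless of how the threads interleave.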
Proposed Fix
To document that the read of `map_count` inside `check_mremap_params` is deliberately lockless, we propose wrapping it in the `data_race()` macro, which marks the race as intentional for KCSAN and suppresses the report. Note that `data_race()` does not constrain the compiler; if load tearing is considered a real concern, `READ_ONCE()` would be the stronger alternative.
```diff
--- a/mm/mremap.c
+++ b/mm/mremap.c
@@ -1813,7 +1813,7 @@ static unsigned long check_mremap_params(struct vma_remap_struct *vrm)
   * Check whether current map count plus 2 still leads us to 4 maps below
   * the threshold, otherwise return -ENOMEM here to be more safe.
   */
- if ((current->mm->map_count + 2) >= sysctl_max_map_count - 3)
+ if ((data_race(current->mm->map_count) + 2) >= sysctl_max_map_count - 3)
   return -ENOMEM;
  return 0;
```
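For comparison, the two possible annotations can be sketched side by side in userspace. The macros below are simplified stand-ins, not the kernel's definitions: in the kernel, `data_race()` additionally tells KCSAN to ignore the access, and `READ_ONCE()` is more elaborate than a single volatile cast. `struct mm_sketch`, `max_map_count_sketch` and both function names are hypothetical.

```c
#include <errno.h>

/* Simplified stand-ins (assumptions); the kernel's real macros do more. */
#define READ_ONCE(x)    (*(const volatile __typeof__(x) *)&(x))
#define data_race(expr) (expr) /* kernel version also annotates for KCSAN */

/* Hypothetical stand-in for the one field of mm_struct used here. */
struct mm_sketch {
	int map_count;
};

/* Assumption: the usual default value of sysctl_max_map_count. */
static int max_map_count_sketch = 65530;

/* Variant matching the proposed diff: the race is only documented. */
static long check_with_data_race(struct mm_sketch *mm)
{
	if ((data_race(mm->map_count) + 2) >= max_map_count_sketch - 3)
		return -ENOMEM;
	return 0;
}

/* Alternative: a single volatile load, which also rules out tearing. */
static long check_with_read_once(struct mm_sketch *mm)
{
	if ((READ_ONCE(mm->map_count) + 2) >= max_map_count_sketch - 3)
		return -ENOMEM;
	return 0;
}
```

Both variants compute the same result for a given value of `map_count`; they differ only in what they promise about how the load is performed, which is the trade-off the maintainers would need to weigh.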
We would be highly honored if this could be of any help.
Best regards,
RacePilot Team