[BUG] mm, swap: KCSAN: data-race in __folio_batch_add_and_move / __lru_add_drain_all

From: Jianzhou Zhao

Date: Wed Mar 11 2026 - 03:34:53 EST



Dear Maintainers,

We are writing to report a data race detected by KCSAN in the Linux kernel. The bug was found by our custom fuzzing tool, RacePilot, and occurs when a folio is added to a per-CPU batch concurrently with a global LRU drain in the memory management subsystem. We observed this on kernel version 6.18.0-08691-g2061f18ad76e-dirty.
Call Trace & Context
==================================================================
BUG: KCSAN: data-race in __folio_batch_add_and_move / __lru_add_drain_all
write to 0xffff88807dd267e0 of 1 bytes by task 11894 on cpu 1:
 folio_batch_add include/linux/pagevec.h:80 [inline]
 __folio_batch_add_and_move+0x7f/0x1b0 mm/swap.c:196
 folio_add_lru+0xbe/0xd0 mm/swap.c:511
 folio_add_lru_vma+0x47/0x70 mm/swap.c:530
 wp_page_copy mm/memory.c:3784 [inline]
 do_wp_page+0xda9/0x2000 mm/memory.c:4180
 handle_pte_fault mm/memory.c:6303 [inline]
 __handle_mm_fault+0xb6c/0x21f0 mm/memory.c:6421
 handle_mm_fault+0x2ee/0x820 mm/memory.c:6590
 do_user_addr_fault arch/x86/mm/fault.c:1336 [inline]
 handle_page_fault arch/x86/mm/fault.c:1476 [inline]
 exc_page_fault+0x398/0x10d0 arch/x86/mm/fault.c:1532
 asm_exc_page_fault+0x26/0x30 arch/x86/include/asm/idtentry.h:618
read to 0xffff88807dd267e0 of 1 bytes by task 5526 on cpu 0:
 folio_batch_count include/linux/pagevec.h:58 [inline]
 cpu_needs_drain mm/swap.c:784 [inline]
 __lru_add_drain_all+0x2e3/0x5a0 mm/swap.c:881
 lru_add_drain_all+0x10/0x20 mm/swap.c:903
 invalidate_bdev+0x7a/0xb0 block/bdev.c:106
 ext4_put_super+0x5dd/0x8f0 fs/ext4/super.c:1348
 generic_shutdown_super+0xec/0x200 fs/super.c:643
 kill_block_super+0x29/0x60 fs/super.c:1730
 ext4_kill_sb+0x48/0x90 fs/ext4/super.c:7444
 deactivate_locked_super+0x72/0x210 fs/super.c:474
 deactivate_super fs/super.c:507 [inline]
 deactivate_super+0x8b/0xa0 fs/super.c:503
 cleanup_mnt+0x22f/0x2c0 fs/namespace.c:1318
 __cleanup_mnt+0x16/0x20 fs/namespace.c:1325
 task_work_run+0x105/0x190 kernel/task_work.c:233
 resume_user_mode_work include/linux/resume_user_mode.h:50 [inline]
 __exit_to_user_mode_loop kernel/entry/common.c:44 [inline]
 exit_to_user_mode_loop+0x129/0x7d0 kernel/entry/common.c:75
 __exit_to_user_mode_prepare include/linux/irq-entry-common.h:226 [inline]
 syscall_exit_to_user_mode_prepare include/linux/irq-entry-common.h:256 [inline]
 syscall_exit_to_user_mode_work include/linux/entry-common.h:159 [inline]
 syscall_exit_to_user_mode include/linux/entry-common.h:194 [inline]
 do_syscall_64+0x27f/0x2c0 arch/x86/entry/syscall_64.c:100
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
value changed: 0x0f -> 0x10
Reported by Kernel Concurrency Sanitizer on:
CPU: 0 UID: 0 PID: 5526 Comm: syz-executor Not tainted 6.18.0-08691-g2061f18ad76e-dirty #44 PREEMPT(voluntary)
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
==================================================================
Execution Flow & Code Context
On CPU 1, a page fault handler allocates a new folio and inserts it into the LRU list, reaching `__folio_batch_add_and_move()`. There, the per-CPU batch is updated via `folio_batch_add()`, which performs a plain, unannotated write to `fbatch->nr`:
```c
// include/linux/pagevec.h
static inline unsigned folio_batch_add(struct folio_batch *fbatch,
           struct folio *folio)
{
 fbatch->folios[fbatch->nr++] = folio; // <-- Write
 return folio_batch_space(fbatch);
}
```
Meanwhile, on CPU 0, `lru_add_drain_all()` iterates over all online CPUs and checks whether their local folio batches need flushing. The check relies on `cpu_needs_drain()`, which calls `folio_batch_count()` to inspect each remote CPU's `fbatch`. This translates to an unannotated read of `fbatch->nr`:
```c
// mm/swap.c
static bool cpu_needs_drain(unsigned int cpu)
{
 struct cpu_fbatches *fbatches = &per_cpu(cpu_fbatches, cpu);
 /* Check these in order of likelihood that they're not zero */
 return folio_batch_count(&fbatches->lru_add) || // <-- Lockless Read
        folio_batch_count(&fbatches->lru_move_tail) ||
...
}
// include/linux/pagevec.h
static inline unsigned int folio_batch_count(const struct folio_batch *fbatch)
{
 return fbatch->nr; // <-- Lockless Read
}
```
Root Cause Analysis
The race arises because the write to `fbatch->nr` in `folio_batch_add()` is strictly local to the owning CPU and performed as a plain, unannotated increment with no cross-CPU locking. Meanwhile, `cpu_needs_drain()` iterates over all CPUs and reads each remote `fbatch->nr` to decide, conservatively, whether global LRU drain work needs to be scheduled. Since it reads the counters of other CPUs without synchronization, this is an intentional, benign data race used as an optimization heuristic.
Unfortunately, we were unable to generate a reproducer for this bug.
Potential Impact
Because the access is a benign cross-CPU inspection used only to decide whether drain work should be offloaded, reading a slightly stale value leads at worst to skipping a needed drain (if 0 is read) or scheduling a redundant drain (if a non-zero value is read for an already-empty batch). It is not functionally critical. However, leaving it unannotated produces KCSAN noise that can hide more serious data races.
Proposed Fix
Since `cpu_needs_drain()` is intentionally heuristic and tolerates racy reads, the unannotated calls to `folio_batch_count()` can be wrapped in the `data_race()` macro to mark the lockless reads as intentional and silence the sanitizer:
```diff
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -781,12 +781,12 @@ static bool cpu_needs_drain(unsigned int cpu)
 	struct cpu_fbatches *fbatches = &per_cpu(cpu_fbatches, cpu);
 
 	/* Check these in order of likelihood that they're not zero */
-	return folio_batch_count(&fbatches->lru_add) ||
-	       folio_batch_count(&fbatches->lru_move_tail) ||
-	       folio_batch_count(&fbatches->lru_deactivate_file) ||
-	       folio_batch_count(&fbatches->lru_deactivate) ||
-	       folio_batch_count(&fbatches->lru_lazyfree) ||
-	       folio_batch_count(&fbatches->lru_activate) ||
+	return data_race(folio_batch_count(&fbatches->lru_add)) ||
+	       data_race(folio_batch_count(&fbatches->lru_move_tail)) ||
+	       data_race(folio_batch_count(&fbatches->lru_deactivate_file)) ||
+	       data_race(folio_batch_count(&fbatches->lru_deactivate)) ||
+	       data_race(folio_batch_count(&fbatches->lru_lazyfree)) ||
+	       data_race(folio_batch_count(&fbatches->lru_activate)) ||
 	       need_mlock_drain(cpu) || has_bh_in_lru(cpu, NULL);
 }
```
We hope this report is helpful.
Best regards,
RacePilot Team