[PATCH] riscv: mm: Fix concurrency in mark_new_valid_map()

From: Vivian Wang

Date: Mon Jun 29 2026 - 05:45:11 EST

It turns out that the concurrency concerns [1] were justified: BOSC
reported a spurious fault in KFENCE that still triggers despite the
previous fixes and shows up as a KFENCE false positive.

Fix the concurrency problems in mark_new_valid_map():

- Add smp_wmb() before filling the bitmap, so that any earlier page
  table writes are ordered before the bitmap stores.
- Use WRITE_ONCE() to fill the bitmap, instead of bitmap_fill().
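
For readers less familiar with this pattern, the ordering described above can be sketched in userspace C11. This is only a rough analogy and not the kernel code: every sketch_* name and SKETCH_NR_CPUS is hypothetical, the release fence stands in for smp_wmb(), the relaxed atomic stores stand in for WRITE_ONCE(), and the acquire fence is merely a placeholder for the sfence.vma (which acts on the TLB and has no direct C11 equivalent). The reader clears its bit with an atomic AND, which is a simplification of what the exception handler does.

```c
#include <limits.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical sizes and names, for illustration only. */
#define SKETCH_NR_CPUS       64
#define SKETCH_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define SKETCH_WORDS \
	((SKETCH_NR_CPUS + SKETCH_BITS_PER_LONG - 1) / SKETCH_BITS_PER_LONG)

/* Stand-in for a page table entry. */
static _Atomic unsigned long sketch_pte;
/* Stand-in for new_valid_map_cpus: one bit per CPU. */
static _Atomic unsigned long sketch_new_valid_map[SKETCH_WORDS];

/* Writer: install the "mapping", then tell every CPU about it. */
static void sketch_publish_mapping(unsigned long pte)
{
	atomic_store_explicit(&sketch_pte, pte, memory_order_relaxed);

	/* Analogue of smp_wmb(): order the store above before the stores below. */
	atomic_thread_fence(memory_order_release);

	/* Analogue of the WRITE_ONCE() loop: one untorn store per bitmap word. */
	for (size_t i = 0; i < SKETCH_WORDS; i++)
		atomic_store_explicit(&sketch_new_valid_map[i], -1UL,
				      memory_order_relaxed);
}

/*
 * Reader: test and clear this CPU's bit. If it was set, fence and then read
 * the "mapping". Returns 0 if the bit was not set.
 */
static unsigned long sketch_check_and_read(unsigned int cpu)
{
	size_t word = cpu / SKETCH_BITS_PER_LONG;
	unsigned long mask = 1UL << (cpu % SKETCH_BITS_PER_LONG);
	unsigned long old;

	old = atomic_fetch_and_explicit(&sketch_new_valid_map[word], ~mask,
					memory_order_relaxed);
	if (!(old & mask))
		return 0;

	/* Placeholder for sfence.vma: pairs with the writer's fence. */
	atomic_thread_fence(memory_order_acquire);

	return atomic_load_explicit(&sketch_pte, memory_order_relaxed);
}
```

In this sketch, a reader that observes its bit set is guaranteed to also observe the value stored before the writer's fence. Without that fence, the bitmap store could become visible first, and the reader could clear its bit, retry, and still miss the new entry.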

This fixes the KFENCE false positives in internal testing.

Also update comments in the assembly exception handler code to match.

Fixes: 26c171fc4853 ("riscv: mm: Use the bitmap API for new_valid_map_cpus")
Reported-by: Yaxing Guo <guoyaxing@xxxxxxxxxx>
Suggested-by: David Hildenbrand (Arm) <david@xxxxxxxxxx>
Link: https://lore.kernel.org/linux-riscv/da19ffcf-8042-4f96-9c2d-649468dc6a0a@xxxxxxxxxx/ # [1]
Signed-off-by: Vivian Wang <wangruikang@xxxxxxxxxxx>
---
arch/riscv/include/asm/cacheflush.h | 15 ++++++++++++++-
arch/riscv/kernel/entry.S | 11 ++++-------
2 files changed, 18 insertions(+), 8 deletions(-)

diff --git a/arch/riscv/include/asm/cacheflush.h b/arch/riscv/include/asm/cacheflush.h
index 8cfe59483a8f..1f191cf801a2 100644
--- a/arch/riscv/include/asm/cacheflush.h
+++ b/arch/riscv/include/asm/cacheflush.h
@@ -6,7 +6,9 @@
#ifndef _ASM_RISCV_CACHEFLUSH_H
#define _ASM_RISCV_CACHEFLUSH_H

+#include <linux/array_size.h>
#include <linux/mm.h>
+#include <asm/barrier.h>

static inline void local_flush_icache_all(void)
{
@@ -46,12 +48,23 @@ extern DECLARE_BITMAP(new_valid_map_cpus, NR_CPUS);
extern char _end[];
static inline void mark_new_valid_map(void)
{
+ /*
+ * Orders any previous page table writes before setting bits in
+ * new_valid_map_cpus. Pairs with the sfence.vma in
+ * new_valid_map_cpus_check.
+ */
+ smp_wmb();
+
/*
* We don't care if concurrently a cpu resets this value since
* the only place this can happen is in handle_exception() where
* an sfence.vma is emitted.
+ *
+ * Not memset() or bitmap_fill() to avoid any possible compiler
+ * shenanigans.
*/
- bitmap_fill(new_valid_map_cpus, NR_CPUS);
+ for (size_t i = 0; i < ARRAY_SIZE(new_valid_map_cpus); i++)
+ WRITE_ONCE(new_valid_map_cpus[i], -1UL);
}
#define flush_cache_vmap flush_cache_vmap
static inline void flush_cache_vmap(unsigned long start, unsigned long end)
diff --git a/arch/riscv/kernel/entry.S b/arch/riscv/kernel/entry.S
index c6988983cdf7..092f408edb7d 100644
--- a/arch/riscv/kernel/entry.S
+++ b/arch/riscv/kernel/entry.S
@@ -76,6 +76,8 @@
amoxor.d a0, a1, (a0)

/*
+ * Pairs with the smp_wmb() in mark_new_valid_map()
+ *
* A sfence.vma is required here. Even if we had Svvptc, there's no
* guarantee that after returning we wouldn't just fault again.
*/
@@ -141,13 +143,8 @@ SYM_CODE_START(handle_exception)
/*
* The RISC-V kernel does not flush TLBs on all CPUS after each new
* vmalloc mapping or kfence_unprotect(), which may result in
- * exceptions:
- *
- * - if the uarch caches invalid entries, the new mapping would not be
- * observed by the page table walker and an invalidation is needed.
- * - if the uarch does not cache invalid entries, a reordered access
- * could "miss" the new mapping and traps: in that case, we only need
- * to retry the access, no sfence.vma is required.
+ * exceptions. In that case, we need to sfence.vma to "receive" the new
+ * mappings and retry, whether or not we have Svvptc.
*/
new_valid_map_cpus_check
#endif

---
base-commit: dc59e4fea9d83f03bad6bddf3fa2e52491777482
change-id: 20260629-riscv-mm-new-valid-map-ordering-007fa3841332

Best regards,
--
Vivian Wang <wangruikang@xxxxxxxxxxx>