[PATCH 0/2] mm: memory-failure: fix HWPoison flag race with non-atomic page flag ops
From: Michael S. Tsirkin
Date: Sun Jun 28 2026 - 17:45:48 EST
I don't like that we are adding overhead to the good path for
the benefit of memory failure, which never triggers on many systems,
but I don't have a better idea. Please take a look.
Non-atomic page flag operations (page->flags.f &= ~mask, __set_bit,
__clear_bit) can race with atomic TestSetPageHWPoison() in
memory_failure(). The non-atomic RMW reads flags, memory_failure()
atomically sets HWPoison, then the RMW writes back the old value
without HWPoison, clobbering the bit.
The race was confirmed by injecting a cpu_relax() delay between the
load and store of the non-atomic RMW in __free_pages_prepare(), then
running concurrent MADV_HWPOISON injection. The clobbered HWPoison
bit was observed repeatedly.
This series fixes the race by:

1. Having memory_failure() call synchronize_rcu() + retry after
   setting HWPoison, so that any in-flight non-atomic RMW that
   read the old flags value completes before we proceed.

2. Wrapping all non-atomic page flag operations in
   rcu_read_lock/rcu_read_unlock (CONFIG_MEMORY_FAILURE only),
   so that synchronize_rcu() actually drains them.
Performance impact (page alloc+free microbenchmark, 200K iterations,
20 runs, KVM guest, error bars are 3-sigma):

!PREEMPT_RCU (x86):
              insns/iter        cycles/iter
  base:       12237 +/- 1       17954 +/- 136
  patched:    +22 +/- 1         -124 +/- 122
              (+0.18%)          (within noise)

PREEMPT_RCU:
              insns/iter        cycles/iter
  base:       12512 +/- 3       18541 +/- 214
  patched:    +95 +/- 3         -12 +/- 161
              (+0.76%)          (within noise)

When !CONFIG_MEMORY_FAILURE, all wrappers compile away completely.
Suggested-by: David Hildenbrand <david@xxxxxxxxxx>

Michael S. Tsirkin (2):
  mm: memory-failure: use RCU to fix HWPoison flag race
  mm: wrap non-atomic page flag ops in RCU for HWPoison safety

 include/linux/mm.h         |  7 ++++
 include/linux/page-flags.h | 81 +++++++++++++++++++++++++++++++++++---
 mm/huge_memory.c           |  2 +
 mm/memory-failure.c        | 54 +++++++++++++++++++++----
 mm/memremap.c              |  6 ++-
 mm/mm_init.c               |  2 +
 mm/page_alloc.c            |  4 ++
 mm/slub.c                  |  2 +-
 8 files changed, 143 insertions(+), 15 deletions(-)

--
MST