Re: [PATCH 0/2] mm: memory-failure: fix HWPoison flag race with non-atomic page flag ops

From: David Hildenbrand (Arm)

Date: Mon Jun 29 2026 - 02:54:19 EST


On 6/28/26 23:45, Michael S. Tsirkin wrote:
> I don't like it that we are adding overhead to the good path for
> the benefit of memory failure, which never triggers on many systems,
> but I don't have a better idea. Pls take a look.

As I said on Friday:

"It also doesn't address the mf_mutex implications and the x86 thingies I
mentioned.

...

I'll either take care of that myself or find someone that can work on this with
attention to all details.
"

This is nothing to vibe-code. This needs a real expert.


>
> Non-atomic page flag operations (page->flags.f &= ~mask, __set_bit,
> __clear_bit) can race with atomic TestSetPageHWPoison() in
> memory_failure(). The non-atomic RMW reads flags, memory_failure()
> atomically sets HWPoison, then the RMW writes back the old value
> without HWPoison, clobbering the bit.
>
> The race was confirmed by injecting a cpu_relax() delay between the
> load and store of the non-atomic RMW in __free_pages_prepare, then
> running concurrent MADV_HWPOISON injection. The clobbered HWPoison
> bit was observed repeatedly.
>
> This series fixes the race by:
>
> 1. Having memory_failure() call synchronize_rcu() + retry after
> setting HWPoison, so that any in-flight non-atomic RMW that
> read the old flags value completes before we proceed.
>
> 2. Wrapping all non-atomic page flag operations in
> rcu_read_lock/rcu_read_unlock (CONFIG_MEMORY_FAILURE only),
> so that synchronize_rcu() actually drains them.
>
> Performance impact (page alloc+free microbenchmark, 200K iterations,
> 20 runs, KVM guest, error bars are 3-sigma):
>
> !PREEMPT_RCU (x86):
>             insns/iter     cycles/iter
> base:       12237 +/- 1    17954 +/- 136
> patched:      +22 +/- 1     -124 +/- 122
>              (+0.18%)      (within noise)
>
> PREEMPT_RCU:
>             insns/iter     cycles/iter
> base:       12512 +/- 3    18541 +/- 214
> patched:      +95 +/- 3      -12 +/- 161
>              (+0.76%)      (within noise)
>
> When !CONFIG_MEMORY_FAILURE, all wrappers compile away completely.
>
> Suggested-by: David Hildenbrand <david@xxxxxxxxxx>

No ;)

--
Cheers,

David