Re: [PATCH 0/2] mm: memory-failure: fix HWPoison flag race with non-atomic page flag ops
From: David Hildenbrand (Arm)
Date: Tue Jun 30 2026 - 02:30:27 EST
On 6/29/26 23:50, Michael S. Tsirkin wrote:
> On Mon, Jun 29, 2026 at 11:22:11PM +0200, David Hildenbrand (Arm) wrote:
>> [...]
>>
>>>
>>> And again, I'm really not sure fixing a theoretical race when memory
>>> is failing is worth slowing the world by 0.1-1% for.
>>>
>>
>> Fully agreed. I was hoping RCU was cheaper (I mean, we were once told that RCU
>> read side locking is essentially for free ... well in some configs :) )
>>
>> The question if we could optimize it reasonably enough ...
>>
>>>
>>> From what I saw in my testing, if we allocate 4k pages
>>> it's hidden by the overhead. With hp and thp it's measureably
>>> worse than rcu on !preempt config.
>>
>> ... for example, by doing the rcu read lock + unlock around the
>>
>> for (i = 1; i < (1 << order); i++) {
>>
>> loop on the alloc path.
>
> Is this different from what this patch is doing?
Ah, I missed that we batch this already. We could make it include the

	page_cpupid_reset_last(page);
	page->flags.f &= ~PAGE_FLAGS_CHECK_AT_PREP;

as well, to reduce from 3 locks to 1.

So I guess there is potential for optimization.
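
Roughly what I have in mind, as a userspace sketch only (not kernel code, not
from the patch). All mock_* names, the struct layout and the mask value are
made up for illustration; the counter just shows that the head-page updates
and the tail loop share a single protected section:

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-ins; none of these are the kernel's definitions. */
struct mock_page {
	unsigned long flags;
	int cpupid;
};

#define MOCK_FLAGS_CHECK_AT_PREP 0x0fUL	/* hypothetical mask value */

static int mock_preempt_depth;		/* current nesting depth */
static int mock_sections_entered;	/* how often a section was opened */

/* Stands in for preempt_disable(): only does bookkeeping here. */
static void mock_preempt_disable(void)
{
	mock_preempt_depth++;
	mock_sections_entered++;
}

/* Stands in for preempt_enable(). */
static void mock_preempt_enable(void)
{
	assert(mock_preempt_depth > 0);
	mock_preempt_depth--;
}

/*
 * Prepare a block of (1 << order) pages with one protected section
 * covering the head-page updates and the per-tail-page loop, instead
 * of opening a separate section for each of the three steps.
 */
static void mock_prep_block(struct mock_page *page, unsigned int order)
{
	size_t i;

	mock_preempt_disable();

	page->cpupid = -1;	/* stands in for page_cpupid_reset_last(page) */
	page->flags &= ~MOCK_FLAGS_CHECK_AT_PREP;	/* non-atomic update */

	for (i = 1; i < ((size_t)1 << order); i++)
		page[i].flags &= ~MOCK_FLAGS_CHECK_AT_PREP;

	mock_preempt_enable();
}
```

Calling mock_prep_block() on an order-2 block opens exactly one section for
all four pages. Whether the real alloc path can be restructured like this is
an open question.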
[...]
>
>> I concluded, similar to Andi, that stop_machine() is too big of a hammer.
>>
>> I wonder if something could be built out of preempt_disable() and sync SMP
>> calls. hmm :(
>
> rcu_lock is basically same as preempt_disable if rcu is non preemptible,
> no?
Yes. See my other mail: I learned that preempt_disable() should likely suffice
for our use case, so the preemptible RCU case would not matter.

I assume that's as good as it gets:

1) Use preempt_disable()/preempt_enable() to protect the non-atomic page flag ops
2) Batch as well as possible in the page allocator

If the overhead is still noticeable after that, there is not a lot we can do to
handle this cleanly, I'm afraid.
--
Cheers,
David