Re: [PATCH v2 3/3] drivers/base/memory: fix locking for poison accounting lookup
From: Muchun Song
Date: Tue Apr 28 2026 - 10:22:26 EST
> On Apr 28, 2026, at 20:34, Miaohe Lin <linmiaohe@xxxxxxxxxx> wrote:
> On 2026/4/28 19:40, Muchun Song wrote:
>>
>>
>>> On Apr 28, 2026, at 19:37, Miaohe Lin <linmiaohe@xxxxxxxxxx> wrote:
>>> On 2026/4/28 16:52, Muchun Song wrote:
>>>> memblk_nr_poison_inc() and memblk_nr_poison_sub() call
>>>> find_memory_block_by_id(), which requires device_hotplug_lock to
>>>> serialize the xarray lookup against memory block removal.
>>>> Take device_hotplug_lock around the lookup and nr_hwpoison update so
>>>> the memory block cannot disappear between xa_load() and get_device().
>>>> Fixes: 5033091de814 ("mm/hwpoison: introduce per-memory_block hwpoison counter")
>>>> Cc: stable@xxxxxxxxxxxxxxx
>>>> Signed-off-by: Muchun Song <songmuchun@xxxxxxxxxxxxx>
>>> Thanks for the update.
>>>> ---
>>>> drivers/base/memory.c | 10 ++++++++--
>>>> 1 file changed, 8 insertions(+), 2 deletions(-)
>>>> diff --git a/drivers/base/memory.c b/drivers/base/memory.c
>>>> index 6981b55d582a..f76aee29e9a5 100644
>>>> --- a/drivers/base/memory.c
>>>> +++ b/drivers/base/memory.c
>>>> @@ -1228,23 +1228,29 @@ int walk_dynamic_memory_groups(int nid, walk_memory_groups_func_t func,
>>>> void memblk_nr_poison_inc(unsigned long pfn)
>>>> {
>>>> const unsigned long block_id = pfn_to_block_id(pfn);
>>>> - struct memory_block *mem = find_memory_block_by_id(block_id);
>>>> + struct memory_block *mem;
>>>> + lock_device_hotplug();
>>> memblk_nr_poison_inc() and memblk_nr_poison_sub() are both called from memory_failure() context.
>>> I'm afraid that if memory_failure() is triggered while device_hotplug_lock is held, it will lead to
>>> a deadlock. Or am I missing something?
>>
>> I am curious: is there any place where memory_failure() is called while holding device_hotplug_lock?
>
> Sorry for the dumb scenario, I was a bit too presumptuous. But there might be another possible deadlock:
>
> remove_memory
>   lock_device_hotplug <-- first called here
>   try_remove_memory
>     remove_memory_block_devices
>       num_poisoned_pages_sub
This path passes pfn = -1 to num_poisoned_pages_sub().
>         memblk_nr_poison_sub
>           lock_device_hotplug <-- deadlock here
No, this can't be reached: with pfn = -1, num_poisoned_pages_sub() does not
call memblk_nr_poison_sub(), so there is no deadlock.
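
To spell out why the chain above stops early, here is a rough userspace
model. It is not the kernel code: every model_* name, the NO_PFN macro and the
two counters are made up for illustration, and the "pfn != -1" guard in
model_num_poisoned_pages_sub() is my assumption about how the real
num_poisoned_pages_sub() treats the pfn = -1 case.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the pfn = -1 passed on the removal path. */
#define NO_PFN (-1UL)

static bool hotplug_locked;   /* models a non-recursive device_hotplug_lock */
static bool self_deadlock;    /* set if the lock is taken while already held */
static long total_poisoned;   /* models the global poisoned page count */
static long block_poisoned;   /* models one memory block's nr_hwpoison */

/* Returns false instead of hanging if the lock is already held. */
static bool model_lock_device_hotplug(void)
{
	if (hotplug_locked) {
		self_deadlock = true;
		return false;
	}
	hotplug_locked = true;
	return true;
}

static void model_unlock_device_hotplug(void)
{
	hotplug_locked = false;
}

/* Patched behaviour: the per-block update runs under the hotplug lock. */
static void model_memblk_nr_poison_sub(unsigned long pfn, long i)
{
	(void)pfn; /* the model tracks a single block only */
	if (!model_lock_device_hotplug())
		return;
	block_poisoned -= i;
	model_unlock_device_hotplug();
}

/* Assumed guard: pfn = -1 skips the per-block accounting entirely. */
static void model_num_poisoned_pages_sub(unsigned long pfn, long i)
{
	total_poisoned -= i;
	if (pfn != NO_PFN)
		model_memblk_nr_poison_sub(pfn, i);
}

/* Removal path: the lock is held, but pfn = -1 is passed down. */
static void model_remove_memory(long nr_poisoned)
{
	if (!model_lock_device_hotplug())
		return;
	model_num_poisoned_pages_sub(NO_PFN, nr_poisoned);
	model_unlock_device_hotplug();
}
```

In this model, model_remove_memory() never trips self_deadlock, because the
nested lock is only taken when a real pfn is passed, which the removal path
does not do.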
Thanks.
>
> Hope I'm not mistaken again. :)
>
> Thanks.
> .