Re: [PATCH] mm/memory-failure: fix deadlock when hugetlb_optimize_vmemmap is enabled

From: Oscar Salvador
Date: Tue Apr 09 2024 - 12:10:32 EST


On Tue, Apr 09, 2024 at 04:10:22PM +0200, Oscar Salvador wrote:
> On Sun, Apr 07, 2024 at 04:54:56PM +0800, Miaohe Lin wrote:
> > In short, below scene breaks the lock dependency chain:
> >
> > memory_failure
> > __page_handle_poison
> > zone_pcp_disable -- lock(pcp_batch_high_lock)
> > dissolve_free_huge_page
> > __hugetlb_vmemmap_restore_folio
> > static_key_slow_dec
> > cpus_read_lock -- rlock(cpu_hotplug_lock)
> >
> > Fix this by calling drain_all_pages() instead.
> >
> > Signed-off-by: Miaohe Lin <linmiaohe@xxxxxxxxxx>
>
> Acked-by: Oscar Salvador <osalvador@xxxxxxx>

On second thought,

disabling pcp via zone_pcp_disable() was a deterministic approach.
Now, with drain_all_pages() we drain the PCP queues to buddy, but nothing
guarantees that those pages do not end up in a PCP queue again (if a
refill is needed) before the call to take_page_off_buddy(), right?

I guess we can live with that because we will let the system know that we
failed to isolate that page.
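
To make the window concrete, here is a toy userspace model of the two
approaches. It is only a sketch and not kernel code: every toy_* name is
made up for illustration, it tracks a single page, and it assumes the
unlucky interleaving where a PCP refill lands right after the drain.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: where the single page we care about currently sits. */
enum page_loc { PAGE_IN_USE, PAGE_ON_PCP, PAGE_IN_BUDDY };

struct toy_zone {
	bool pcp_enabled;	/* false models the effect of zone_pcp_disable() */
	enum page_loc page;
};

/* Models drain_all_pages(): a one-shot flush of PCP pages to buddy. */
static void toy_drain(struct toy_zone *z)
{
	if (z->page == PAGE_ON_PCP)
		z->page = PAGE_IN_BUDDY;
}

/* Models a PCP refill pulling the page out of buddy again. */
static void toy_refill_pcp(struct toy_zone *z)
{
	if (z->pcp_enabled && z->page == PAGE_IN_BUDDY)
		z->page = PAGE_ON_PCP;
}

/* Models take_page_off_buddy(): works only if the page is in buddy. */
static bool toy_take_page_off_buddy(struct toy_zone *z)
{
	if (z->page != PAGE_IN_BUDDY)
		return false;
	z->page = PAGE_IN_USE;
	return true;
}

/* Drain only: a refill between drain and take makes isolation fail. */
bool toy_isolate_with_drain(void)
{
	struct toy_zone z = { .pcp_enabled = true, .page = PAGE_ON_PCP };

	toy_drain(&z);
	toy_refill_pcp(&z);	/* assumed unlucky interleaving */
	return toy_take_page_off_buddy(&z);
}

/* PCP disabled around the operation: the same refill is a no-op. */
bool toy_isolate_with_pcp_disabled(void)
{
	struct toy_zone z = { .pcp_enabled = true, .page = PAGE_ON_PCP };
	bool ok;

	z.pcp_enabled = false;
	toy_drain(&z);
	assert(z.page == PAGE_IN_BUDDY);
	toy_refill_pcp(&z);	/* cannot move the page while disabled */
	ok = toy_take_page_off_buddy(&z);
	z.pcp_enabled = true;
	return ok;
}
```

In this model the drain-only path reports a failed isolation for that
interleaving, while the disabled-PCP path always succeeds. That is the
determinism we give up in exchange for avoiding the lock dependency.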


--
Oscar Salvador
SUSE Labs