Re: [v2 PATCH 3/3] mm: hwpoison: dump page for unhandlable page

From: HORIGUCHI NAOYA(堀口 直也)
Date: Fri Aug 20 2021 - 02:48:44 EST


On Wed, Aug 18, 2021 at 10:41:16PM -0700, Yang Shi wrote:
> Currently only a very simple message is shown for an unhandlable page, e.g.
> a non-LRU page, like:
> soft_offline: 0x1469f2: unknown non LRU page type 5ffff0000000000 ()
>
> That is not very helpful for further debugging; calling dump_page() would
> show more useful information.
>
> Call dump_page() in get_any_page() so that the call is not duplicated
> in a couple of different places. It may be called with pcp disabled and
> while holding the memory hotplug lock, but that should not be a big deal
> since the hwpoison handler is not called very often.
>
> Suggested-by: Matthew Wilcox <willy@xxxxxxxxxxxxx>
> Cc: Naoya Horiguchi <naoya.horiguchi@xxxxxxx>
> Cc: Oscar Salvador <osalvador@xxxxxxx>
> Signed-off-by: Yang Shi <shy828301@xxxxxxxxx>
> ---
> mm/memory-failure.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/mm/memory-failure.c b/mm/memory-failure.c
> index 7cfa134b1370..60df8fcd0444 100644
> --- a/mm/memory-failure.c
> +++ b/mm/memory-failure.c
> @@ -1228,6 +1228,9 @@ static int get_any_page(struct page *p, unsigned long flags)
>  		ret = -EIO;
>  	}
>  out:
> +	if (ret == -EIO)
> +		dump_page(p, "hwpoison: unhandlable page");
> +

I feel that the four callers of get_hwpoison_page() run in different contexts,
so it might be better to decide for each of them separately whether to add
dump_page(). soft_offline_page() still prints the
"%s: %#lx: unknown page type: %lx (%pGp)" message, which would duplicate the
dump_page() output, so that printk() could be dropped. In
memory_failure_hugetlb() and memory_failure(), we can call dump_page() after
action_result(). unpoison_memory() doesn't need dump_page() at all because
it only deals with pages that are already hwpoisoned.
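
To make the per-caller idea concrete, here is a rough userspace sketch. It is
not a patch and not kernel code: struct page, dump_page(), get_hwpoison_page()
and action_result() below are simplified mocks with made-up signatures, the
mock_*() callers are hypothetical, and the error handling of the real callers
is left out. It only shows where dump_page() would or would not be called.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Userspace stand-ins only: every type and function here is a mock. */
struct page { unsigned long pfn; };

static int dump_calls;	/* number of times the mock dump_page() ran */

/* Mock of dump_page(): just counts the call and prints the reason. */
static void dump_page(const struct page *p, const char *reason)
{
	assert(p && reason);
	dump_calls++;
	printf("page dump pfn %#lx: %s\n", p->pfn, reason);
}

/* Mock of get_hwpoison_page(): pretends the page is always unhandlable. */
static int get_hwpoison_page(struct page *p)
{
	(void)p;
	return -EIO;
}

/* Mock of action_result(): only logs the outcome. */
static void action_result(unsigned long pfn, const char *msg)
{
	printf("Memory failure: %#lx: %s\n", pfn, msg);
}

/* memory_failure() style path: report the result, then dump the page. */
static int mock_memory_failure(struct page *p)
{
	int ret = get_hwpoison_page(p);

	if (ret == -EIO) {
		action_result(p->pfn, "unhandlable page");
		dump_page(p, "hwpoison: unhandlable page");
	}
	return ret;
}

/* soft_offline_page() style path: dump_page() replaces the one-line printk(). */
static int mock_soft_offline_page(struct page *p)
{
	int ret = get_hwpoison_page(p);

	if (ret == -EIO)
		dump_page(p, "hwpoison: unhandlable page");
	return ret;
}

/* unpoison_memory() style path: page is already hwpoisoned, so no dump. */
static int mock_unpoison_memory(struct page *p)
{
	return get_hwpoison_page(p);
}
```

With this layout each caller decides on its own whether the dump is useful,
instead of get_any_page() dumping unconditionally on -EIO.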

Thanks,
Naoya Horiguchi