Re: [PATCH] mm: hwpoison: fix thp split handling in soft_offline_in_use_page()

From: Michal Hocko
Date: Tue Feb 26 2019 - 08:25:47 EST


[Cc Kirill for the THP side]

On Tue 26-02-19 19:18:00, zhong jiang wrote:
> From: zhongjiang <zhongjiang@xxxxxxxxxx>
>
> When soft_offline_in_use_page() runs on a thp tail page after the pmd is
> split, we trigger the following VM_BUG_ON_PAGE():
>
> Memory failure: 0x3755ff: non anonymous thp
> __get_any_page: 0x3755ff: unknown zero refcount page type 2fffff80000000
> Soft offlining pfn 0x34d805 at process virtual address 0x20fff000
> page:ffffea000d360140 count:0 mapcount:0 mapping:0000000000000000 index:0x1
> flags: 0x2fffff80000000()
> raw: 002fffff80000000 ffffea000d360108 ffffea000d360188 0000000000000000
> raw: 0000000000000001 0000000000000000 00000000ffffffff 0000000000000000
> page dumped because: VM_BUG_ON_PAGE(page_ref_count(page) == 0)
> ------------[ cut here ]------------
> kernel BUG at ./include/linux/mm.h:519!
>
> soft_offline_in_use_page() passed refcount and page lock from tail page to
> head page, which is not needed because we can pass any subpage to
> split_huge_page().
>
> Cc: <stable@xxxxxxxxxxxxxxx> [4.5+]
> Signed-off-by: zhongjiang <zhongjiang@xxxxxxxxxx>
> ---
> mm/memory-failure.c | 14 ++++++--------
> 1 file changed, 6 insertions(+), 8 deletions(-)
>
> diff --git a/mm/memory-failure.c b/mm/memory-failure.c
> index d9b8a24..6edc6db 100644
> --- a/mm/memory-failure.c
> +++ b/mm/memory-failure.c
> @@ -1823,19 +1823,17 @@ static int soft_offline_in_use_page(struct page *page, int flags)
>  	struct page *hpage = compound_head(page);
>
>  	if (!PageHuge(page) && PageTransHuge(hpage)) {
> -		lock_page(hpage);
> -		if (!PageAnon(hpage) || unlikely(split_huge_page(hpage))) {
> -			unlock_page(hpage);
> -			if (!PageAnon(hpage))
> +		lock_page(page);
> +		if (!PageAnon(page) || unlikely(split_huge_page(page))) {
> +			unlock_page(page);
> +			if (!PageAnon(page))
>  				pr_info("soft offline: %#lx: non anonymous thp\n", page_to_pfn(page));
>  			else
>  				pr_info("soft offline: %#lx: thp split failed\n", page_to_pfn(page));
> -			put_hwpoison_page(hpage);
> +			put_hwpoison_page(page);
>  			return -EBUSY;
>  		}
> -		unlock_page(hpage);
> -		get_hwpoison_page(page);
> -		put_hwpoison_page(hpage);
> +		unlock_page(page);
>  	}
>
>  	/*
> --
> 1.7.12.4
>
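
For anyone following along, here is a minimal userspace sketch (not kernel
code) of the rule the changelog leans on: any subpage can be handed to
split_huge_page(), and the caller's pin then stays with the subpage that was
passed in. Everything named toy_* is hypothetical and exists only for this
illustration, and the "pin follows the passed page" behaviour is modelled as
an assumption. It does not try to reproduce the exact path to the
VM_BUG_ON_PAGE() above.

```c
#include <assert.h>
#include <stddef.h>

#define NR_SUBPAGES 4	/* toy size, far smaller than a real THP */

/* Hypothetical stand-in for struct page: only the refcount is modelled. */
struct toy_page {
	int refcount;
};

/* Hypothetical stand-in for a THP: subpage 0 is the head, the rest tails. */
struct toy_thp {
	struct toy_page sub[NR_SUBPAGES];
	int split;	/* 0 while compound, 1 after the split */
};

/* While the page is compound, the caller's single pin lives on the head. */
static void toy_thp_init_pinned(struct toy_thp *thp)
{
	for (size_t i = 0; i < NR_SUBPAGES; i++)
		thp->sub[i].refcount = 0;
	thp->sub[0].refcount = 1;
	thp->split = 0;
}

/*
 * Assumed rule: after the split, the caller's pin belongs to the subpage
 * that was passed in, and every other subpage is left unpinned.
 */
static void toy_split(struct toy_thp *thp, size_t passed)
{
	assert(passed < NR_SUBPAGES);
	assert(!thp->split && thp->sub[0].refcount == 1);
	thp->sub[0].refcount = 0;
	thp->sub[passed].refcount = 1;
	thp->split = 1;
}

/* Old flow: split through the head, then inspect the target subpage. */
static int refcount_after_split_via_head(size_t target)
{
	struct toy_thp thp;

	toy_thp_init_pinned(&thp);
	toy_split(&thp, 0);
	return thp.sub[target].refcount;
}

/* Patched flow: split through the target subpage itself. */
static int refcount_after_split_via_subpage(size_t target)
{
	struct toy_thp thp;

	toy_thp_init_pinned(&thp);
	toy_split(&thp, target);
	return thp.sub[target].refcount;
}
```

In this model, refcount_after_split_via_head(3) returns 0 and
refcount_after_split_via_subpage(3) returns 1. That is why the patched code
no longer needs the get_hwpoison_page(page)/put_hwpoison_page(hpage) dance
after a successful split.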

--
Michal Hocko
SUSE Labs