Re: [PATCH v5 08/11] mm: hwpoison: soft offline supports thp migration
From: Zi Yan
Date: Thu Apr 27 2017 - 12:40:19 EST
Naoya Horiguchi wrote:
> On Fri, Apr 21, 2017 at 10:55:49AM -0500, Zi Yan wrote:
>>
>> Anshuman Khandual wrote:
>>> On 04/21/2017 02:17 AM, Zi Yan wrote:
>>>> From: Naoya Horiguchi <n-horiguchi@xxxxxxxxxxxxx>
>>>>
>>>> This patch enables thp migration for soft offline.
>>>>
>>>> Signed-off-by: Naoya Horiguchi <n-horiguchi@xxxxxxxxxxxxx>
>>>>
>>>> ChangeLog: v1 -> v5:
>>>> - fix page isolation counting error
>>>>
>>>> Signed-off-by: Zi Yan <zi.yan@xxxxxxxxxxxxxx>
>>>> ---
>>>> mm/memory-failure.c | 35 ++++++++++++++---------------------
>>>> 1 file changed, 14 insertions(+), 21 deletions(-)
>>>>
>>>> diff --git a/mm/memory-failure.c b/mm/memory-failure.c
>>>> index 9b77476ef31f..23ff02eb3ed4 100644
>>>> --- a/mm/memory-failure.c
>>>> +++ b/mm/memory-failure.c
>>>> @@ -1481,7 +1481,17 @@ static struct page *new_page(struct page *p, unsigned long private, int **x)
>>>> if (PageHuge(p))
>>>> return alloc_huge_page_node(page_hstate(compound_head(p)),
>>>> nid);
>>>> - else
>>>> + else if (thp_migration_supported() && PageTransHuge(p)) {
>>>> + struct page *thp;
>>>> +
>>>> + thp = alloc_pages_node(nid,
>>>> + (GFP_TRANSHUGE | __GFP_THISNODE) & ~__GFP_RECLAIM,
>>> Why not __GFP_RECLAIM ? Its soft offline path we wait a bit before
>>> declaring that THP page cannot be allocated and hence should invoke
>>> reclaim methods as well.
>> I am not sure how much effort the kernel wants to put here to soft
>> offline a THP. Naoya knows more here.
>
> What I thought at first was that soft offline is not an urgent user
> and no need to reclaim (i.e. give a little impact on other thread.)
> But that's not a strong opinion, so if you like __GFP_RECLAIM here,
> I'm fine about that.
OK, I will add __GFP_RECLAIM.
>
>>
>>>> + HPAGE_PMD_ORDER);
>>>> + if (!thp)
>>>> + return NULL;
>>>> + prep_transhuge_page(thp);
>>>> + return thp;
>>>> + } else
>>>> return __alloc_pages_node(nid, GFP_HIGHUSER_MOVABLE, 0);
>>>> }
>>>>
>>>> @@ -1665,8 +1675,8 @@ static int __soft_offline_page(struct page *page, int flags)
>>>> * cannot have PAGE_MAPPING_MOVABLE.
>>>> */
>>>> if (!__PageMovable(page))
>>>> - inc_node_page_state(page, NR_ISOLATED_ANON +
>>>> - page_is_file_cache(page));
>>>> + mod_node_page_state(page_pgdat(page), NR_ISOLATED_ANON +
>>>> + page_is_file_cache(page), hpage_nr_pages(page));
>>>> list_add(&page->lru, &pagelist);
>>>> ret = migrate_pages(&pagelist, new_page, NULL, MPOL_MF_MOVE_ALL,
>>>> MIGRATE_SYNC, MR_MEMORY_FAILURE);
>>>> @@ -1689,28 +1699,11 @@ static int __soft_offline_page(struct page *page, int flags)
>>>> static int soft_offline_in_use_page(struct page *page, int flags)
>>>> {
>>>> int ret;
>>>> - struct page *hpage = compound_head(page);
>>>> -
>>>> - if (!PageHuge(page) && PageTransHuge(hpage)) {
>>>> - lock_page(hpage);
>>>> - if (!PageAnon(hpage) || unlikely(split_huge_page(hpage))) {
>>>> - unlock_page(hpage);
>>>> - if (!PageAnon(hpage))
>>>> - pr_info("soft offline: %#lx: non anonymous thp\n", page_to_pfn(page));
>>>> - else
>>>> - pr_info("soft offline: %#lx: thp split failed\n", page_to_pfn(page));
>>>> - put_hwpoison_page(hpage);
>>>> - return -EBUSY;
>>>> - }
>>>> - unlock_page(hpage);
>>>> - get_hwpoison_page(page);
>>>> - put_hwpoison_page(hpage);
>>>> - }
>>>>
>>>> if (PageHuge(page))
>>>> ret = soft_offline_huge_page(page, flags);
>>>> else
>>>> - ret = __soft_offline_page(page, flags);
>>>> + ret = __soft_offline_page(compound_head(page), flags);
>>> Hmm, what if the THP allocation fails in the new_page() path and
>>> we fallback for general page allocation. In that case we will
>>> always be still calling with the head page ? Because we dont
>>> split the huge page any more.
>> This could be a problem if the user wants to offline a TailPage but due
>> to THP allocation failure, the HeadPage is offlined.
>
> Right, "retry with split" part is unfinished, so we need some improvement.
>
>> It may be better to only soft offline THPs if page ==
>> compound_head(page). If page != compound_head(page), we still split THPs
>> like before.
>>
>> Because in migrate_pages(), we cannot guarantee any TailPages in that
>> THP are migrated (1. THP allocation failure causes THP splitting, then
>> only HeadPage is going to be migrated; 2. even if we change existing
>> migrate_pages() implementation to add all TailPages to migration list
>> instead of LRU list, we still cannot guarantee the TailPage we want to
>> migrate is migrated.).
>>
>> Naoya, what do you think?
>
> Maybe soft offline is a special caller of page migration because it
> basically wants to migrate only one page, but thp migration still has
> a benefit because we can avoid thp split.
> So I like that we try thp migration at first, and if it fails we fall
> back to split and migrate (only) a raw error page. This should be done
> in caller side for soft offline, because it knows where the error page is.
Make sense. So when migrate_pages() sees the migrate reason is
MR_MEMORY_FAILURE, it will not split THP when newpage allocation fails.
Then, the soft offline caller will split failed THP and retry migrating
the error subpage. I can do that.
>
> As for generic case (for other migration callers which mainly want to
> migrate multiple pages for their purpose,) thp split and retry can be
> done in common migration code. After thp split, all subpages are linked
> to migration list, then we retry without returning to the caller.
> So I think that split_huge_page() can be moved to (for example) for-loop
> in migrate_pages().
>
> I tried to write a patch for it last year, but considering vm event
> accounting, the patch might be large (~100 lines).
Yes. I saw your code on your github. I can pick it up and send it for
review after this patchset is merged, if you are OK with it.
--
Best Regards,
Yan Zi
Attachment:
signature.asc
Description: OpenPGP digital signature