Re: [External] Re: [PATCH] arm64: mm: hugetlb: add support for free vmemmap pages of HugeTLB

From: Muchun Song
Date: Wed May 19 2021 - 12:22:40 EST


On Wed, May 19, 2021 at 11:22 PM Muchun Song <songmuchun@xxxxxxxxxxxxx> wrote:
>
> On Wed, May 19, 2021 at 10:43 PM Muchun Song <songmuchun@xxxxxxxxxxxxx> wrote:
> >
> > On Wed, May 19, 2021 at 8:35 PM Anshuman Khandual
> > <anshuman.khandual@xxxxxxx> wrote:
> > >
> > >
> > > On 5/18/21 2:48 PM, Muchun Song wrote:
> > > > The preparation of supporting freeing vmemmap associated with each
> > > > HugeTLB page is ready, so we can support this feature for arm64.
> > > >
> > > > Signed-off-by: Muchun Song <songmuchun@xxxxxxxxxxxxx>
> > > > ---
> > > >  arch/arm64/mm/mmu.c | 5 +++++
> > > >  fs/Kconfig          | 2 +-
> > > >  2 files changed, 6 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
> > > > index 5d37e461c41f..967b01ce468d 100644
> > > > --- a/arch/arm64/mm/mmu.c
> > > > +++ b/arch/arm64/mm/mmu.c
> > > > @@ -23,6 +23,7 @@
> > > >  #include <linux/mm.h>
> > > >  #include <linux/vmalloc.h>
> > > >  #include <linux/set_memory.h>
> > > > +#include <linux/hugetlb.h>
> > > >
> > > > #include <asm/barrier.h>
> > > > #include <asm/cputype.h>
> > > > @@ -1134,6 +1135,10 @@ int __meminit vmemmap_populate(unsigned long start, unsigned long end, int node,
> > > >  	pmd_t *pmdp;
> > > >
> > > >  	WARN_ON((start < VMEMMAP_START) || (end > VMEMMAP_END));
> > > > +
> > > > +	if (is_hugetlb_free_vmemmap_enabled() && !altmap)
> > > > +		return vmemmap_populate_basepages(start, end, node, altmap);
> > > > +
> > > >  	do {
> > > >  		next = pmd_addr_end(addr, end);
> > > >
> > > > diff --git a/fs/Kconfig b/fs/Kconfig
> > > > index 6ce6fdac00a3..02c2d3bf1cb8 100644
> > > > --- a/fs/Kconfig
> > > > +++ b/fs/Kconfig
> > > > @@ -242,7 +242,7 @@ config HUGETLB_PAGE
> > > >
> > > >  config HUGETLB_PAGE_FREE_VMEMMAP
> > > >  	def_bool HUGETLB_PAGE
> > > > -	depends on X86_64
> > > > +	depends on X86_64 || ARM64
> > > >  	depends on SPARSEMEM_VMEMMAP
> > > >
> > > > config MEMFD_CREATE
> > > >
> > >
> > > How does this interact with HugeTLB migration as such which might iterate
> > > over individual constituent struct pages (overriding the same struct page
> > > for all tail pages when this feature is enabled). A simple test involving
> > > madvise(ptr, size, MADV_SOFT_OFFLINE) fails on various HugeTLB page sizes,
> > > with this patch applied. Although I have not debugged this any further.
> >
> > That is strange. Actually, this patch does not change the behaviour of
> > page migration, and the feature is off by default; it is only enabled
> > by passing "hugetlb_free_vmemmap=on" on the boot cmdline. Do you mean
> > that the success rate of page migration decreases when the feature is
> > enabled, and increases again when it is disabled?
>
> I have run the test and found the issue: unmap_and_move_huge_page()
> always returns -EBUSY in this case. I will look into it in more depth.
> Thanks for your report.
>
> The return point is as below:
>
> 	if (page_private(hpage) && !page_mapping(hpage)) {
> 		rc = -EBUSY;
> 		goto out_unlock;
> 	}

I have found the root cause. It was introduced by commit d6995da31122
("hugetlb: use page.private for hugetlb specific page flags"), which
repurposed page.private of the huge page for hugetlb-specific flags, so
page_private(hpage) can now be non-zero even when no subpool is
attached. The patch below fixes the issue.

diff --git a/mm/migrate.c b/mm/migrate.c
index e7a173da74ec..43419c4bb097 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1290,7 +1290,7 @@ static int unmap_and_move_huge_page(new_page_t get_new_page,
 	 * page_mapping() set, hugetlbfs specific move page routine will not
 	 * be called and we could leak usage counts for subpools.
 	 */
-	if (page_private(hpage) && !page_mapping(hpage)) {
+	if (hugetlb_page_subpool(hpage) && !page_mapping(hpage)) {
 		rc = -EBUSY;
 		goto out_unlock;
 	}

>
> >
> > Thanks.
> >
> >
> > >
> > > Soft offlining pfn 0x101c00 at process virtual address 0xffff7fa00000
> > > soft offline: 0x101c00: hugepage migration failed 1, type bfffc0000010006
> > > (referenced|uptodate|head|node=0|zone=2|lastcpupid=0xffff)
> > >