Re: [PATCH v2 1/2] mm: Allow non-VM_DONTEXPAND and VM_PFNMAP mappings with MREMAP_DONTUNMAP

From: Brian Geffon
Date: Wed Mar 17 2021 - 16:45:51 EST


Hi Peter,
Thank you as always for taking a look. This change relies on the
existing check in vma_to_resize on line 686:
https://elixir.bootlin.com/linux/v5.12-rc3/source/mm/mremap.c#L686
which returns -EFAULT when the vma is VM_DONTEXPAND or VM_PFNMAP.

Thanks
Brian

On Wed, Mar 17, 2021 at 4:40 PM Peter Xu <peterx@xxxxxxxxxx> wrote:
>
> Hi, Brian,
>
> On Wed, Mar 17, 2021 at 12:13:33PM -0700, Brian Geffon wrote:
> > Currently MREMAP_DONTUNMAP only accepts private anonymous mappings. This
> > change will widen the support to include any mappings which are not
> > VM_DONTEXPAND or VM_PFNMAP. The primary use case is to support
> > MREMAP_DONTUNMAP on mappings which may have been created from a memfd.
> >
> > This change which takes advantage of the existing check in vma_to_resize
> > for non-VM_DONTEXPAND and non-VM_PFNMAP mappings will cause
> > MREMAP_DONTUNMAP to return -EFAULT if such mappings are remapped. This
> > behavior is consistent with existing behavior when using mremap with
> > such mappings.
> >
> > Lokesh Gidra who works on the Android JVM, provided an explanation of how
> > such a feature will improve Android JVM garbage collection:
> > "Android is developing a new garbage collector (GC), based on userfaultfd.
> > The garbage collector will use userfaultfd (uffd) on the java heap during
> > compaction. On accessing any uncompacted page, the application threads will
> > find it missing, at which point the thread will create the compacted page
> > and then use UFFDIO_COPY ioctl to get it mapped and then resume execution.
> > Before starting this compaction, in a stop-the-world pause the heap will be
> > mremap(MREMAP_DONTUNMAP) so that the java heap is ready to receive
> > UFFD_EVENT_PAGEFAULT events after resuming execution.
> >
> > To speedup mremap operations, pagetable movement was optimized by moving
> > PUD entries instead of PTE entries [1]. It was necessary as mremap of even
> > modest sized memory ranges also took several milliseconds, and stopping the
> > application for that long isn't acceptable in response-time sensitive
> > cases.
> >
> > With UFFDIO_CONTINUE feature [2], it will be even more efficient to
> > implement this GC, particularly the 'non-moveable' portions of the heap.
> > It will also help in reducing the need to copy (UFFDIO_COPY) the pages.
> > However, for this to work, the java heap has to be on a 'shared' vma.
> > Currently MREMAP_DONTUNMAP only supports private anonymous mappings, this
> > patch will enable using UFFDIO_CONTINUE for the new userfaultfd-based heap
> > compaction."
> >
> > [1] https://lore.kernel.org/linux-mm/20201215030730.NC3CU98e4%25akpm@xxxxxxxxxxxxxxxxxxxx/
> > [2] https://lore.kernel.org/linux-mm/20210302000133.272579-1-axelrasmussen@xxxxxxxxxx/
> >
> > Signed-off-by: Brian Geffon <bgeffon@xxxxxxxxxx>
> > ---
> > mm/mremap.c | 4 ----
> > 1 file changed, 4 deletions(-)
> >
> > diff --git a/mm/mremap.c b/mm/mremap.c
> > index ec8f840399ed..2c57dc4bc8b6 100644
> > --- a/mm/mremap.c
> > +++ b/mm/mremap.c
> > @@ -653,10 +653,6 @@ static struct vm_area_struct *vma_to_resize(unsigned long addr,
> > return ERR_PTR(-EINVAL);
> > }
> >
> > - if (flags & MREMAP_DONTUNMAP && (!vma_is_anonymous(vma) ||
> > - vma->vm_flags & VM_SHARED))
> > - return ERR_PTR(-EINVAL);
> > -
> > if (is_vm_hugetlb_page(vma))
> > return ERR_PTR(-EINVAL);
>
> The code change seems to be not aligned with what the commit message said. Did
> you perhaps forget to add the checks against VM_DONTEXPAND | VM_PFNMAP? I'm
> guessing that (instead of commit message to be touched up) because you still
> attached the revert patch, then that check seems to be needed. Thanks,
>
> --
> Peter Xu
>