Re: [PATCH 38/94] mm/gup: Add mm_populate_vma() for use when the vma is known

From: Liam Howlett
Date: Mon May 03 2021 - 19:06:14 EST


* Matthew Wilcox <willy@xxxxxxxxxxxxx> [210503 12:02]:
> On Mon, May 03, 2021 at 03:53:58PM +0000, Liam Howlett wrote:
> > * Michel Lespinasse <michel@xxxxxxxxxxxxxx> [210501 01:13]:
> > > On Wed, Apr 28, 2021 at 03:36:08PM +0000, Liam Howlett wrote:
> > > > When a vma is known, avoid calling mm_populate to search for the vma to
> > > > populate.
> > > >
> > > > Signed-off-by: Liam R. Howlett <Liam.Howlett@xxxxxxxxxx>
> > > > ---
> > > > mm/gup.c | 20 ++++++++++++++++++++
> > > > mm/internal.h | 4 ++++
> > > > 2 files changed, 24 insertions(+)
> > > >
> > > > diff --git a/mm/gup.c b/mm/gup.c
> > > > index c3a17b189064..48fe98ab0729 100644
> > > > --- a/mm/gup.c
> > > > +++ b/mm/gup.c
> > > > @@ -1468,6 +1468,26 @@ long populate_vma_page_range(struct vm_area_struct *vma,
> > > > NULL, NULL, locked);
> > > > }
> > > >
> > > > +/*
> > > > + * mm_populate_vma() - Populate a single range in a single vma.
> > > > + * @vma: The vma to populate.
> > > > + * @start: The start address to populate
> > > > + * @end: The end address to stop populating
> > > > + *
> > > > + * Note: Ignores errors.
> > > > + */
> > > > +void mm_populate_vma(struct vm_area_struct *vma, unsigned long start,
> > > > + unsigned long end)
> > > > +{
> > > > + struct mm_struct *mm = current->mm;
> > > > + int locked = 1;
> > > > +
> > > > + mmap_read_lock(mm);
> > > > + populate_vma_page_range(vma, start, end, &locked);
> > > > + if (locked)
> > > > + mmap_read_unlock(mm);
> > > > +}
> > > > +
> > >
> > > This seems like a nonsensical API at first glance - VMAs that are found
> > > in the vma tree might be modified, merged, split, or freed at any time
> > > if the mmap lock is not held, so the API can not be safely used. I think
> > > this applies to maple tree vmas just as much as it did for rbtree vmas ?
> >
> > This is correct - it cannot be used without having the mmap_sem lock.
> > This is a new internal mm code API and is used to avoid callers that use
> > mm_populate() on a range that is known to be in a single VMA and already
> > have that VMA. So instead of re-walking the tree to re-find the VMAs,
> > this function can be used with the known VMA and range.
> >
> > It is used as described in patch 39 and 40 of this series.
>
> In patch 39, what you do is:
>
> 1 Take the mmap_sem for write
> 2 do stuff
> 3 Drop the mmap_sem
> 4 Call mm_populate_vm() with the vma, which takes the mmap_sem
> for read
>
> The problem is that between 3 & 4, a racing thread might cause us to free
> the vma and so we've now passed a bogus pointer into mm_populate_vm().
>
> What we need instead is to downgrade the mmap_sem from write to read at
> step 3, so the vma is guaranteed to still be good.

Thank you. I will remove these patches from the series and work on this
idea.

Regards,
Liam