Re: [PATCH 05/13] mm/frame-vector: Use FOLL_LONGTERM

From: Daniel Vetter
Date: Wed Oct 07 2020 - 17:30:15 EST


On Wed, Oct 7, 2020 at 11:13 PM John Hubbard <jhubbard@xxxxxxxxxx> wrote:
>
> On 10/7/20 9:44 AM, Daniel Vetter wrote:
> > This is used by media/videbuf2 for persistent dma mappings, not just
> > for a single dma operation and then freed again, so needs
> > FOLL_LONGTERM.
> >
> > Unfortunately current pup_locked doesn't support FOLL_LONGTERM due to
> > locking issues. Rework the code to pull the pup path out from the
> > mmap_sem critical section as suggested by Jason.
> >
> > Signed-off-by: Daniel Vetter <daniel.vetter@xxxxxxxxx>
> > Cc: Jason Gunthorpe <jgg@xxxxxxxx>
> > Cc: Pawel Osciak <pawel@xxxxxxxxxx>
> > Cc: Marek Szyprowski <m.szyprowski@xxxxxxxxxxx>
> > Cc: Kyungmin Park <kyungmin.park@xxxxxxxxxxx>
> > Cc: Tomasz Figa <tfiga@xxxxxxxxxxxx>
> > Cc: Mauro Carvalho Chehab <mchehab@xxxxxxxxxx>
> > Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
> > Cc: John Hubbard <jhubbard@xxxxxxxxxx>
> > Cc: Jérôme Glisse <jglisse@xxxxxxxxxx>
> > Cc: Jan Kara <jack@xxxxxxx>
> > Cc: Dan Williams <dan.j.williams@xxxxxxxxx>
> > Cc: linux-mm@xxxxxxxxx
> > Cc: linux-arm-kernel@xxxxxxxxxxxxxxxxxxx
> > Cc: linux-samsung-soc@xxxxxxxxxxxxxxx
> > Cc: linux-media@xxxxxxxxxxxxxxx
> > ---
> > mm/frame_vector.c | 36 +++++++++++-------------------------
> > 1 file changed, 11 insertions(+), 25 deletions(-)
> >
> > diff --git a/mm/frame_vector.c b/mm/frame_vector.c
> > index 10f82d5643b6..39db520a51dc 100644
> > --- a/mm/frame_vector.c
> > +++ b/mm/frame_vector.c
> > @@ -38,7 +38,6 @@ int get_vaddr_frames(unsigned long start, unsigned int nr_frames,
> > struct vm_area_struct *vma;
> > int ret = 0;
> > int err;
> > - int locked;
> >
> > if (nr_frames == 0)
> > return 0;
> > @@ -48,35 +47,22 @@ int get_vaddr_frames(unsigned long start, unsigned int nr_frames,
> >
> > start = untagged_addr(start);
> >
> > + ret = pin_user_pages_fast(start, nr_frames,
> > + FOLL_FORCE | FOLL_WRITE | FOLL_LONGTERM,
> > + (struct page **)(vec->ptrs));
> > + if (ret > 0) {
> > + vec->got_ref = true;
> > + vec->is_pfns = false;
> > + goto out_unlocked;
> > + }
>
> This part looks good, and changing to _fast is a potential performance improvement,
> too.
>
> > +
> > mmap_read_lock(mm);
> > - locked = 1;
> > vma = find_vma_intersection(mm, start, start + 1);
> > if (!vma) {
> > ret = -EFAULT;
> > goto out;
> > }
> >
> > - /*
> > - * While get_vaddr_frames() could be used for transient (kernel
> > - * controlled lifetime) pinning of memory pages all current
> > - * users establish long term (userspace controlled lifetime)
> > - * page pinning. Treat get_vaddr_frames() like
> > - * get_user_pages_longterm() and disallow it for filesystem-dax
> > - * mappings.
> > - */
> > - if (vma_is_fsdax(vma)) {
> > - ret = -EOPNOTSUPP;
> > - goto out;
> > - }
>
> Are you sure we don't need to check vma_is_fsdax() anymore?

Since FOLL_LONGTERM checks for this and can only return struct page
backed memory, and explicitly excludes VM_IO | VM_PFNMAP, was assuming
this is not needed for follow_pfn. And the get_user_pages_locked this
used back then didn't have the same check, hence why it was added (and
FOLL_LONGTERM still doesn't work for the _locked versions, as you
pointed out on the last round of this discussion).

But now that you're asking, I have no idea whether fsdax vma can also
be of VM_IO | VM_PFNMAP type. I'm not seeing that set anywhere in
fs/dax.c, but that says nothing :-)

Dan, you added this check originally, do we need it for VM_SPECIAL vmas too?

Thanks, Daniel

>
> > -
> > - if (!(vma->vm_flags & (VM_IO | VM_PFNMAP))) {
> > - vec->got_ref = true;
> > - vec->is_pfns = false;
> > - ret = pin_user_pages_locked(start, nr_frames,
> > - gup_flags, (struct page **)(vec->ptrs), &locked);
> > - goto out;
> > - }
> > -
> > vec->got_ref = false;
> > vec->is_pfns = true;
> > do {
> > @@ -101,8 +87,8 @@ int get_vaddr_frames(unsigned long start, unsigned int nr_frames,
> > vma = find_vma_intersection(mm, start, start + 1);
> > } while (vma && vma->vm_flags & (VM_IO | VM_PFNMAP));
> > out:
> > - if (locked)
> > - mmap_read_unlock(mm);
> > + mmap_read_unlock(mm);
> > +out_unlocked:
> > if (!ret)
> > ret = -EFAULT;
> > if (ret > 0)
> >
>
> All of the error handling still looks accurate there.
>
> thanks,
> --
> John Hubbard
> NVIDIA



--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch