Re: [PATCH 2/2] mm/frame-vec: use FOLL_LONGTERM

From: Daniel Vetter
Date: Fri Oct 02 2020 - 14:17:02 EST


On Fri, Oct 2, 2020 at 8:06 PM Jason Gunthorpe <jgg@xxxxxxxx> wrote:
> On Fri, Oct 02, 2020 at 07:53:03PM +0200, Daniel Vetter wrote:
> > For $reasons I've stumbled over this code and I'm not sure the change
> > to the new gup functions in 55a650c35fea ("mm/gup: frame_vector:
> > convert get_user_pages() --> pin_user_pages()") was entirely correct.
> >
> > This here is used for long term buffers (not just quick I/O) like
> > RDMA, and John notes this in his patch. But I thought the rule for
> > these is that they need to add FOLL_LONGTERM, which John's patch
> > didn't do.
> >
> > There is already a dax specific check (added in b7f0554a56f2 ("mm:
> > fail get_vaddr_frames() for filesystem-dax mappings")), so this seems
> > like the prudent thing to do.
> >
> > Signed-off-by: Daniel Vetter <daniel.vetter@xxxxxxxxx>
> > Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
> > Cc: John Hubbard <jhubbard@xxxxxxxxxx>
> > Cc: Jérôme Glisse <jglisse@xxxxxxxxxx>
> > Cc: Jan Kara <jack@xxxxxxx>
> > Cc: Dan Williams <dan.j.williams@xxxxxxxxx>
> > Cc: linux-mm@xxxxxxxxx
> > Cc: linux-arm-kernel@xxxxxxxxxxxxxxxxxxx
> > Cc: linux-samsung-soc@xxxxxxxxxxxxxxx
> > Cc: linux-media@xxxxxxxxxxxxxxx
> > Hi all,
> >
> > I stumbled over this and figured typing this patch can't hurt. Really
> > just to maybe learn a few things about how gup/pup is supposed to be
> > used (we have a bit of that in drivers/gpu), this here isn't really
> > ralated to anything I'm doing.
>
> FOLL_FORCE is a pretty big clue it should be FOLL_LONGTERM, IMHO

Since you're here ... I've noticed that ib sets FOLL_FORCE when the ib
verb access mode indicates possible writes. I'm not really clear on
why FOLL_WRITE isn't enough any why you need to be able to write
through a vma that's write protected currently.

> > I'm also wondering whether the explicit dax check should be removed,
> > since FOLL_LONGTERM should take care of that already.
>
> Yep! Confirms the above!
>
> This get_vaddr_frames() thing looks impossible to use properly. How on
> earth does a driver guarentee
>
> "If @start belongs to VM_IO | VM_PFNMAP vma, we don't touch page
> structures and the caller must make sure pfns aren't reused for
> anything else while he is using them."
>
> The only possible way to do that is if the driver restricts the VMAs
> to ones it owns and interacts with the vm_private data to refcount
> something.
>
> Since every driver does this wrong anything that uses this is creating
> terrifying security issues.
>
> IMHO this whole API should be deleted :(

Yeah that part I just tried to conveniently ignore. I guess this dates
back to a time when ioremaps where at best fixed, and there wasn't
anything like a gpu driver dynamically managing vram around, resulting
in random entirely unrelated things possibly being mapped to that set
of pfns.

The underlying follow_pfn is also used in other places within
drivers/media, so this doesn't seem to be an accident, but actually
intentional.

I guess minimally we'd need a VM_PFNMAP flag for dynamically manged
drivers like modern drm gpu drivers, to make sure follow_pfn doesn't
follow these?
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch