Re: [PATCH v5 05/15] mm/frame-vector: Use FOLL_LONGTERM

From: John Hubbard
Date: Fri Oct 30 2020 - 22:55:32 EST


On 10/30/20 3:08 AM, Daniel Vetter wrote:
This is used by media/videbuf2 for persistent dma mappings, not just
for a single dma operation and then freed again, so needs
FOLL_LONGTERM.

Unfortunately current pup_locked doesn't support FOLL_LONGTERM due to
locking issues. Rework the code to pull the pup path out from the
mmap_sem critical section as suggested by Jason.

By relying entirely on the vma checks in pin_user_pages and follow_pfn

There are vma checks in pin_user_pages(), but this patch changes things
to call pin_user_pages_fast(). And that does not have the vma checks.
More below about this:

(for vm_flags and vma_is_fsdax) we can also streamline the code a lot.

Signed-off-by: Daniel Vetter <daniel.vetter@xxxxxxxxx>
Cc: Jason Gunthorpe <jgg@xxxxxxxx>
Cc: Pawel Osciak <pawel@xxxxxxxxxx>
Cc: Marek Szyprowski <m.szyprowski@xxxxxxxxxxx>
Cc: Kyungmin Park <kyungmin.park@xxxxxxxxxxx>
Cc: Tomasz Figa <tfiga@xxxxxxxxxxxx>
Cc: Mauro Carvalho Chehab <mchehab@xxxxxxxxxx>
Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
Cc: John Hubbard <jhubbard@xxxxxxxxxx>
Cc: Jérôme Glisse <jglisse@xxxxxxxxxx>
Cc: Jan Kara <jack@xxxxxxx>
Cc: Dan Williams <dan.j.williams@xxxxxxxxx>
Cc: linux-mm@xxxxxxxxx
Cc: linux-arm-kernel@xxxxxxxxxxxxxxxxxxx
Cc: linux-samsung-soc@xxxxxxxxxxxxxxx
Cc: linux-media@xxxxxxxxxxxxxxx
Signed-off-by: Daniel Vetter <daniel.vetter@xxxxxxxx>
--
v2: Streamline the code and further simplify the loop checks (Jason)

v5: Review from Tomasz:
- fix page counting for the follow_pfn case by resetting ret
- drop gup_flags paramater, now unused
---
.../media/common/videobuf2/videobuf2-memops.c | 3 +-
include/linux/mm.h | 2 +-
mm/frame_vector.c | 53 ++++++-------------
3 files changed, 19 insertions(+), 39 deletions(-)

diff --git a/drivers/media/common/videobuf2/videobuf2-memops.c b/drivers/media/common/videobuf2/videobuf2-memops.c
index 6e9e05153f4e..9dd6c27162f4 100644
--- a/drivers/media/common/videobuf2/videobuf2-memops.c
+++ b/drivers/media/common/videobuf2/videobuf2-memops.c
@@ -40,7 +40,6 @@ struct frame_vector *vb2_create_framevec(unsigned long start,
unsigned long first, last;
unsigned long nr;
struct frame_vector *vec;
- unsigned int flags = FOLL_FORCE | FOLL_WRITE;
first = start >> PAGE_SHIFT;
last = (start + length - 1) >> PAGE_SHIFT;
@@ -48,7 +47,7 @@ struct frame_vector *vb2_create_framevec(unsigned long start,
vec = frame_vector_create(nr);
if (!vec)
return ERR_PTR(-ENOMEM);
- ret = get_vaddr_frames(start & PAGE_MASK, nr, flags, vec);
+ ret = get_vaddr_frames(start & PAGE_MASK, nr, vec);
if (ret < 0)
goto out_destroy;
/* We accept only complete set of PFNs */
diff --git a/include/linux/mm.h b/include/linux/mm.h
index ef360fe70aaf..d6b8e30dce2e 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1765,7 +1765,7 @@ struct frame_vector {
struct frame_vector *frame_vector_create(unsigned int nr_frames);
void frame_vector_destroy(struct frame_vector *vec);
int get_vaddr_frames(unsigned long start, unsigned int nr_pfns,
- unsigned int gup_flags, struct frame_vector *vec);
+ struct frame_vector *vec);
void put_vaddr_frames(struct frame_vector *vec);
int frame_vector_to_pages(struct frame_vector *vec);
void frame_vector_to_pfns(struct frame_vector *vec);
diff --git a/mm/frame_vector.c b/mm/frame_vector.c
index 10f82d5643b6..f8c34b895c76 100644
--- a/mm/frame_vector.c
+++ b/mm/frame_vector.c
@@ -32,13 +32,12 @@
* This function takes care of grabbing mmap_lock as necessary.
*/
int get_vaddr_frames(unsigned long start, unsigned int nr_frames,
- unsigned int gup_flags, struct frame_vector *vec)
+ struct frame_vector *vec)
{
struct mm_struct *mm = current->mm;
struct vm_area_struct *vma;
int ret = 0;
int err;
- int locked;
if (nr_frames == 0)
return 0;
@@ -48,40 +47,26 @@ int get_vaddr_frames(unsigned long start, unsigned int nr_frames,
start = untagged_addr(start);
- mmap_read_lock(mm);
- locked = 1;
- vma = find_vma_intersection(mm, start, start + 1);
- if (!vma) {
- ret = -EFAULT;
- goto out;
- }
-
- /*
- * While get_vaddr_frames() could be used for transient (kernel
- * controlled lifetime) pinning of memory pages all current
- * users establish long term (userspace controlled lifetime)
- * page pinning. Treat get_vaddr_frames() like
- * get_user_pages_longterm() and disallow it for filesystem-dax
- * mappings.
- */
- if (vma_is_fsdax(vma)) {
- ret = -EOPNOTSUPP;
- goto out;
- }
-
- if (!(vma->vm_flags & (VM_IO | VM_PFNMAP))) {

By removing this check from this location, and changing from
pin_user_pages_locked() to pin_user_pages_fast(), I *think* we end up
losing the check entirely. Is that intended? If so it could use a comment
somewhere to explain why.

thanks,
--
John Hubbard
NVIDIA

+ ret = pin_user_pages_fast(start, nr_frames,
+ FOLL_FORCE | FOLL_WRITE | FOLL_LONGTERM,
+ (struct page **)(vec->ptrs));
+ if (ret > 0) {
vec->got_ref = true;
vec->is_pfns = false;
- ret = pin_user_pages_locked(start, nr_frames,
- gup_flags, (struct page **)(vec->ptrs), &locked);
- goto out;
+ goto out_unlocked;
}
+ mmap_read_lock(mm);
vec->got_ref = false;
vec->is_pfns = true;
+ ret = 0;
do {
unsigned long *nums = frame_vector_pfns(vec);
+ vma = find_vma_intersection(mm, start, start + 1);
+ if (!vma)
+ break;
+
while (ret < nr_frames && start + PAGE_SIZE <= vma->vm_end) {
err = follow_pfn(vma, start, &nums[ret]);
if (err) {
@@ -92,17 +77,13 @@ int get_vaddr_frames(unsigned long start, unsigned int nr_frames,
start += PAGE_SIZE;
ret++;
}
- /*
- * We stop if we have enough pages or if VMA doesn't completely
- * cover the tail page.
- */
- if (ret >= nr_frames || start < vma->vm_end)
+ /* Bail out if VMA doesn't completely cover the tail page. */
+ if (start < vma->vm_end)
break;
- vma = find_vma_intersection(mm, start, start + 1);
- } while (vma && vma->vm_flags & (VM_IO | VM_PFNMAP));
+ } while (ret < nr_frames);
out:
- if (locked)
- mmap_read_unlock(mm);
+ mmap_read_unlock(mm);
+out_unlocked:
if (!ret)
ret = -EFAULT;
if (ret > 0)