Re: [PATCH v2 3/3] userfaultfd: use per-vma locks in userfaultfd operations

From: Liam R. Howlett
Date: Wed Jan 31 2024 - 16:44:14 EST


* Suren Baghdasaryan <surenb@xxxxxxxxxx> [240130 22:03]:
> On Mon, Jan 29, 2024 at 6:58 PM Liam R. Howlett <Liam.Howlett@xxxxxxxxxx> wrote:

..

> > > > > > @@ -730,7 +759,7 @@ static __always_inline ssize_t mfill_atomic(struct userfaultfd_ctx *ctx,
> > > > > >
> > > > > > out_unlock:
> > > > > > up_read(&ctx->map_changing_lock);
> > > > > > - mmap_read_unlock(dst_mm);
> > > > > > + unpin_vma(dst_mm, dst_vma, &mmap_locked);
> > > > > > out:
> > > > > > if (folio)
> > > > > > folio_put(folio);
> > > > > > @@ -1285,8 +1314,6 @@ static int validate_move_areas(struct userfaultfd_ctx *ctx,
> > > > > > * @len: length of the virtual memory range
> > > > > > * @mode: flags from uffdio_move.mode
> > > > > > *
> > > > > > - * Must be called with mmap_lock held for read.
> > > > > > - *
> > > > > > * move_pages() remaps arbitrary anonymous pages atomically in zero
> > > > > > * copy. It only works on non shared anonymous pages because those can
> > > > > > * be relocated without generating non linear anon_vmas in the rmap
> > > > > > @@ -1353,15 +1380,16 @@ static int validate_move_areas(struct userfaultfd_ctx *ctx,
> > > > > > * could be obtained. This is the only additional complexity added to
> > > > > > * the rmap code to provide this anonymous page remapping functionality.
> > > > > > */
> > > > > > -ssize_t move_pages(struct userfaultfd_ctx *ctx, struct mm_struct *mm,
> > > > > > - unsigned long dst_start, unsigned long src_start,
> > > > > > - unsigned long len, __u64 mode)
> > > > > > +ssize_t move_pages(struct userfaultfd_ctx *ctx, unsigned long dst_start,
> > > > > > + unsigned long src_start, unsigned long len, __u64 mode)
> > > > > > {
> > > > > > + struct mm_struct *mm = ctx->mm;
> > > > > > struct vm_area_struct *src_vma, *dst_vma;
> > > > > > unsigned long src_addr, dst_addr;
> > > > > > pmd_t *src_pmd, *dst_pmd;
> > > > > > long err = -EINVAL;
> > > > > > ssize_t moved = 0;
> > > > > > + bool mmap_locked = false;
> > > > > >
> > > > > > /* Sanitize the command parameters. */
> > > > > > if (WARN_ON_ONCE(src_start & ~PAGE_MASK) ||
> > > > > > @@ -1374,28 +1402,52 @@ ssize_t move_pages(struct userfaultfd_ctx *ctx, struct mm_struct *mm,
> > > > > > WARN_ON_ONCE(dst_start + len <= dst_start))
> > > > > > goto out;
> > > > >
> > > > > Ah, is this safe for rmap? I think you need to leave this read lock.
> > > > >
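For reference, unpin_vma() in this series reads to me as a conditional
unlock. A minimal sketch of what I take it to be, assuming mmap_locked
records that the lookup fell back to the mmap_lock (the exact body may
differ from the posted patch):

static void unpin_vma(struct mm_struct *mm, struct vm_area_struct *vma,
		      bool *mmap_locked)
{
	if (*mmap_locked) {
		/* The lookup fell back to the mmap read lock. */
		mmap_read_unlock(mm);
		*mmap_locked = false;
	} else {
		/* The per-VMA read lock was taken instead. */
		vma_end_read(vma);
	}
}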
> > > I didn't fully understand you here.
> >
> > Sorry, I'm confused about how your locking scheme prevents rmap from
> > trying to use the VMA while the atomic increment part is in flight.
>
> I'm also a bit confused. Which atomic increment are you referring to?
> AFAIU move_pages() will lock both src_vma and dst_vma, so even if rmap
> finds them it can't modify them, no?

The uffd atomic counter, mmap_changing.
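That is, the counter the event paths (fork, mremap, the madvise
removals) bump under the write side of ctx->map_changing_lock, and
which the fill/move paths check under the read side before touching
anything, roughly (a sketch of the pattern, not a quote of patch 2/3):

	down_read(&ctx->map_changing_lock);
	err = -EAGAIN;
	if (atomic_read(&ctx->mmap_changing))
		goto out_unlock;

What I'm asking is what keeps rmap, which takes neither of these, away
from the VMA while move_pages() remaps pages under only the per-vma
locks.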

..
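For context, the lookup side that sets mmap_locked would, on my
reading, be something like the below -- a sketch only, with lock_vma()
as a stand-in name rather than a quote of the patch:

static struct vm_area_struct *lock_vma(struct mm_struct *mm,
				       unsigned long addr,
				       bool *mmap_locked)
{
	struct vm_area_struct *vma;

	/* Fast path: per-VMA read lock, no mmap_lock taken. */
	vma = lock_vma_under_rcu(mm, addr);
	if (vma)
		return vma;

	/* Fall back to the mmap read lock and record that fact. */
	mmap_read_lock(mm);
	*mmap_locked = true;
	return find_vma(mm, addr);
}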