Re: [PATCH v2 2/5] mm: abstract the vma_merge()/split_vma() pattern for mprotect() et al.

From: Lorenzo Stoakes
Date: Wed Oct 11 2023 - 02:34:56 EST


On Tue, Oct 10, 2023 at 10:14:52PM -0400, Liam R. Howlett wrote:
> * Lorenzo Stoakes <lstoakes@xxxxxxxxx> [231009 16:53]:
> > mprotect() and other functions which change VMA parameters over a range
> > each employ a pattern of:-
> >
> > 1. Attempt to merge the range with adjacent VMAs.
> > 2. If this fails, and the range spans a subset of the VMA, split it
> > accordingly.
> >
> > This is open-coded and duplicated in each case. Also in each case most of
> > the parameters passed to vma_merge() remain the same.
> >
> > Create a new function, vma_modify(), which abstracts this operation,
> > accepting only those parameters which can be changed.
> >
> > To avoid the mess of invoking each function call with unnecessary
> > parameters, create inline wrapper functions for each of the modify
> > operations, parameterised only by what is required to perform the action.
> >
> > Note that the userfaultfd_release() case works even though it does not
> > split VMAs - since start is set to vma->vm_start and end is set to
> > vma->vm_end, the split logic does not trigger.
> >
> > In addition, since we calculate pgoff to be equal to vma->vm_pgoff + (start
> > - vma->vm_start) >> PAGE_SHIFT, and start - vma->vm_start will be 0 in this
> > instance, this invocation will remain unchanged.
> >
> > Signed-off-by: Lorenzo Stoakes <lstoakes@xxxxxxxxx>
> > ---
> > fs/userfaultfd.c | 69 +++++++++++++++-------------------------------
> > include/linux/mm.h | 60 ++++++++++++++++++++++++++++++++++++++++
> > mm/madvise.c | 32 ++++++---------------
> > mm/mempolicy.c | 22 +++------------
> > mm/mlock.c | 27 +++++-------------
> > mm/mmap.c | 45 ++++++++++++++++++++++++++++++
> > mm/mprotect.c | 35 +++++++----------------
> > 7 files changed, 157 insertions(+), 133 deletions(-)
> >
> > diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
> > index a7c6ef764e63..ba44a67a0a34 100644
> > --- a/fs/userfaultfd.c
> > +++ b/fs/userfaultfd.c
> > @@ -927,11 +927,10 @@ static int userfaultfd_release(struct inode *inode, struct file *file)
> > continue;
> > }
> > new_flags = vma->vm_flags & ~__VM_UFFD_FLAGS;
> > - prev = vma_merge(&vmi, mm, prev, vma->vm_start, vma->vm_end,
> > - new_flags, vma->anon_vma,
> > - vma->vm_file, vma->vm_pgoff,
> > - vma_policy(vma),
> > - NULL_VM_UFFD_CTX, anon_vma_name(vma));
> > + prev = vma_modify_flags_uffd(&vmi, prev, vma, vma->vm_start,
> > + vma->vm_end, new_flags,
> > + NULL_VM_UFFD_CTX);
> > +
> > if (prev) {
> > vma = prev;
> > } else {
> > @@ -1331,7 +1330,6 @@ static int userfaultfd_register(struct userfaultfd_ctx *ctx,
> > unsigned long start, end, vma_end;
> > struct vma_iterator vmi;
> > bool wp_async = userfaultfd_wp_async_ctx(ctx);
> > - pgoff_t pgoff;
> >
> > user_uffdio_register = (struct uffdio_register __user *) arg;
> >
> > @@ -1484,28 +1482,17 @@ static int userfaultfd_register(struct userfaultfd_ctx *ctx,
> > vma_end = min(end, vma->vm_end);
> >
> > new_flags = (vma->vm_flags & ~__VM_UFFD_FLAGS) | vm_flags;
> > - pgoff = vma->vm_pgoff + ((start - vma->vm_start) >> PAGE_SHIFT);
> > - prev = vma_merge(&vmi, mm, prev, start, vma_end, new_flags,
> > - vma->anon_vma, vma->vm_file, pgoff,
> > - vma_policy(vma),
> > - ((struct vm_userfaultfd_ctx){ ctx }),
> > - anon_vma_name(vma));
> > - if (prev) {
> > - /* vma_merge() invalidated the mas */
> > - vma = prev;
> > - goto next;
> > - }
> > - if (vma->vm_start < start) {
> > - ret = split_vma(&vmi, vma, start, 1);
> > - if (ret)
> > - break;
> > - }
> > - if (vma->vm_end > end) {
> > - ret = split_vma(&vmi, vma, end, 0);
> > - if (ret)
> > - break;
> > + prev = vma_modify_flags_uffd(&vmi, prev, vma, start, vma_end,
> > + new_flags,
> > + (struct vm_userfaultfd_ctx){ctx});
> > + if (IS_ERR(prev)) {
> > + ret = PTR_ERR(prev);
> > + break;
> > }
> > - next:
> > +
> > + if (prev)
> > + vma = prev; /* vma_merge() invalidated the mas */
>
> This is a stale comment. The maple state is in the vma iterator, which
> is passed through. I missed this on the vma iterator conversion.

Ack, this was coincidentally removed in v3 so this is already resolved.