Re: [PATCH v2 10/26] userfaultfd: wp: add UFFDIO_COPY_MODE_WP

From: Peter Xu
Date: Mon Feb 25 2019 - 01:46:03 EST


On Fri, Feb 22, 2019 at 10:15:47AM -0500, Jerome Glisse wrote:
> On Fri, Feb 22, 2019 at 03:11:06PM +0800, Peter Xu wrote:
> > On Thu, Feb 21, 2019 at 12:29:19PM -0500, Jerome Glisse wrote:
> > > On Tue, Feb 12, 2019 at 10:56:16AM +0800, Peter Xu wrote:
> > > > From: Andrea Arcangeli <aarcange@xxxxxxxxxx>
> > > >
> > > > This allows UFFDIO_COPY to map pages wrprotected.
> > > >
> > > > Signed-off-by: Andrea Arcangeli <aarcange@xxxxxxxxxx>
> > > > Signed-off-by: Peter Xu <peterx@xxxxxxxxxx>
> > >
> > > Minor nitpick down below, but in any case:
> > >
> > > Reviewed-by: Jérôme Glisse <jglisse@xxxxxxxxxx>
> > >
> > > > ---
> > > > fs/userfaultfd.c | 5 +++--
> > > > include/linux/userfaultfd_k.h | 2 +-
> > > > include/uapi/linux/userfaultfd.h | 11 +++++-----
> > > > mm/userfaultfd.c | 36 ++++++++++++++++++++++----------
> > > > 4 files changed, 35 insertions(+), 19 deletions(-)
> > > >
> > >
> > > [...]
> > >
> > > > diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
> > > > index d59b5a73dfb3..73a208c5c1e7 100644
> > > > --- a/mm/userfaultfd.c
> > > > +++ b/mm/userfaultfd.c
> > > > @@ -25,7 +25,8 @@ static int mcopy_atomic_pte(struct mm_struct *dst_mm,
> > > > struct vm_area_struct *dst_vma,
> > > > unsigned long dst_addr,
> > > > unsigned long src_addr,
> > > > - struct page **pagep)
> > > > + struct page **pagep,
> > > > + bool wp_copy)
> > > > {
> > > > struct mem_cgroup *memcg;
> > > > pte_t _dst_pte, *dst_pte;
> > > > @@ -71,9 +72,9 @@ static int mcopy_atomic_pte(struct mm_struct *dst_mm,
> > > > if (mem_cgroup_try_charge(page, dst_mm, GFP_KERNEL, &memcg, false))
> > > > goto out_release;
> > > >
> > > > - _dst_pte = mk_pte(page, dst_vma->vm_page_prot);
> > > > - if (dst_vma->vm_flags & VM_WRITE)
> > > > - _dst_pte = pte_mkwrite(pte_mkdirty(_dst_pte));
> > > > + _dst_pte = pte_mkdirty(mk_pte(page, dst_vma->vm_page_prot));
> > > > + if (dst_vma->vm_flags & VM_WRITE && !wp_copy)
> > > > + _dst_pte = pte_mkwrite(_dst_pte);
> > >
> > > I like parentheses around the bitwise and :) i.e.:
> > > (dst_vma->vm_flags & VM_WRITE) && !wp_copy
> > >
> > > I feel it is easier to read.
> >
> > Yeah another one. Though this line will be changed in follow up
> > patches, will fix anyways.
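
For reference, the two spellings of the condition are equivalent in C,
since binary & binds tighter than &&, so the parentheses only help the
reader. A minimal userspace sketch of that, assuming a made-up flag
macro (VM_WRITE_SKETCH and both helper names are hypothetical and not
taken from the kernel headers):

```c
#include <stdbool.h>

/* Hypothetical flag value, for illustration only. */
#define VM_WRITE_SKETCH 0x2UL

/* Condition as written in the original hunk: no parentheses. */
static bool writable_unparenthesized(unsigned long vm_flags, bool wp_copy)
{
	return vm_flags & VM_WRITE_SKETCH && !wp_copy;
}

/* Condition with the suggested parentheses around the bitwise and. */
static bool writable_parenthesized(unsigned long vm_flags, bool wp_copy)
{
	return (vm_flags & VM_WRITE_SKETCH) && !wp_copy;
}
```

Both helpers return the same value for every combination of flags and
wp_copy, so the change is purely about readability.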
> >
> > >
> > > [...]
> > >
> > > > @@ -416,11 +418,13 @@ static __always_inline ssize_t mfill_atomic_pte(struct mm_struct *dst_mm,
> > > > if (!(dst_vma->vm_flags & VM_SHARED)) {
> > > > if (!zeropage)
> > > > err = mcopy_atomic_pte(dst_mm, dst_pmd, dst_vma,
> > > > - dst_addr, src_addr, page);
> > > > + dst_addr, src_addr, page,
> > > > + wp_copy);
> > > > else
> > > > err = mfill_zeropage_pte(dst_mm, dst_pmd,
> > > > dst_vma, dst_addr);
> > > > } else {
> > > > + VM_WARN_ON(wp_copy); /* WP only available for anon */
> > >
> > > Don't you want to return with error here ?
> >
> > Makes sense to me. Does this look good to you to be squashed into
> > the current patch?
> >
> > diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
> > index 73a208c5c1e7..f3ea09f412d4 100644
> > --- a/mm/userfaultfd.c
> > +++ b/mm/userfaultfd.c
> > @@ -73,7 +73,7 @@ static int mcopy_atomic_pte(struct mm_struct *dst_mm,
> > goto out_release;
> >
> > _dst_pte = pte_mkdirty(mk_pte(page, dst_vma->vm_page_prot));
> > - if (dst_vma->vm_flags & VM_WRITE && !wp_copy)
> > + if ((dst_vma->vm_flags & VM_WRITE) && !wp_copy)
> > _dst_pte = pte_mkwrite(_dst_pte);
> >
> > dst_pte = pte_offset_map_lock(dst_mm, dst_pmd, dst_addr, &ptl);
> > @@ -424,7 +424,10 @@ static __always_inline ssize_t mfill_atomic_pte(struct mm_struct *dst_mm,
> > err = mfill_zeropage_pte(dst_mm, dst_pmd,
> > dst_vma, dst_addr);
> > } else {
> > - VM_WARN_ON(wp_copy); /* WP only available for anon */
> > + if (unlikely(wp_copy))
> > + /* TODO: WP currently only available for anon */
> > + return -EINVAL;
> > +
> > if (!zeropage)
> > err = shmem_mcopy_atomic_pte(dst_mm, dst_pmd,
> > dst_vma, dst_addr,
>
> I would keep the VM_WARN_ON, or maybe a ONCE variant, so that we at
> least have a chance to be informed if for some reason that code path
> is taken. With that my r-b stands.
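
One way to combine the two ideas (warn only once, and still fail the
request) can be sketched in plain userspace C. This is only an
illustration of the pattern, not the kernel code: warn_on_once(),
warn_count and fill_shared_sketch() are hypothetical names, and the
helper warns once per process rather than once per call site like the
real *_ONCE() macros do.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Number of warnings actually printed (kept only so it can be checked). */
static int warn_count;

/*
 * Hypothetical stand-in for a warn-once helper: prints at most one
 * message no matter how often the condition is hit, and returns the
 * condition so the caller can still act on it.
 */
static bool warn_on_once(bool cond, const char *msg)
{
	static bool warned;

	if (cond && !warned) {
		warned = true;
		warn_count++;
		fprintf(stderr, "WARNING: %s\n", msg);
	}
	return cond;
}

/* Sketch of the shared-memory branch: warn once, always reject wp_copy. */
static int fill_shared_sketch(bool wp_copy)
{
	if (warn_on_once(wp_copy, "wp_copy requested on a shared mapping"))
		return -EINVAL;

	return 0;
}
```

Calling fill_shared_sketch(true) repeatedly keeps returning -EINVAL but
logs a single warning, so userspace cannot flood the log by retrying.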

Yeah, *ONCE() is fine with me too (both can avoid a DoS attack from
userspace), and I don't have a strong opinion on whether we should fail
on this specific ioctl if it happens. For now I'll just take the
advice and the r-b together. Thanks,

--
Peter Xu