Re: [PATCH RFC 06/24] userfaultfd: wp: support write protection for userfault vma range
From: Peter Xu
Date: Thu Jan 24 2019 - 00:47:31 EST

On Tue, Jan 22, 2019 at 09:43:38PM -0500, Jerome Glisse wrote:
> On Wed, Jan 23, 2019 at 10:17:45AM +0800, Peter Xu wrote:
> > On Tue, Jan 22, 2019 at 12:02:24PM -0500, Jerome Glisse wrote:
> > > On Tue, Jan 22, 2019 at 05:39:35PM +0800, Peter Xu wrote:
> > > > On Mon, Jan 21, 2019 at 09:05:35AM -0500, Jerome Glisse wrote:
> > > >
> > > > [...]
> > > >
> > > > > > +       change_protection(dst_vma, start, start + len, newprot,
> > > > > > +                         !enable_wp, 0);
> > > > >
> > > > > So setting dirty_accountable brings us to that code in mprotect.c:
> > > > >
> > > > >     if (dirty_accountable && pte_dirty(ptent) &&
> > > > >         (pte_soft_dirty(ptent) ||
> > > > >          !(vma->vm_flags & VM_SOFTDIRTY))) {
> > > > >             ptent = pte_mkwrite(ptent);
> > > > >     }
> > > > >
> > > > > My understanding is that you want to set the write flag when enable_wp
> > > > > is false, and you want to set the write flag unconditionally, right?
> > > >
> > > > Right.
> > > >
> > > > >
> > > > > If so then you should really move the change_protection() flags
> > > > > patch before this patch and add a flag for setting pte write flags.
> > > > >
> > > > > Otherwise the above is broken as it will only set the write flag
> > > > > for ptes that were dirty, and i am guessing so far you were always
> > > > > lucky because the ptes were all dirty (change_protection will preserve
> > > > > dirtiness) when you write protected them.
> > > > >
> > > > > So i believe the above is broken or at very least unclear if what
> > > > > you really want is to only set write flag to pte that have the
> > > > > dirty flag set.
> > > >
> > > > You are right, if we build the tree until this patch it won't work for
> > > > all the cases. It'll only work if the page was at least writable
> > > > before and also it's dirty (as you explained). Sorry to be unclear
> > > > about this, maybe I should at least mention that in the commit message
> > > > but I totally forgot it.
> > > >
> > > > All these problems are solved in later patches, please feel free to
> > > > have a look at:
> > > >
> > > > mm: merge parameters for change_protection()
> > > > userfaultfd: wp: apply _PAGE_UFFD_WP bit
> > > > userfaultfd: wp: handle COW properly for uffd-wp
> > > >
> > > > Note that even in the follow up patches IMHO we can't directly change
> > > > the write permission since the page can be shared by other processes
> > > > (e.g., the zero page or COW pages). But the general idea is the same
> > > > as you explained.
> > > >
> > > > I tried to avoid squashing all of this together, as explained
> > > > previously. Also, this patch can be seen as a standalone patch to
> > > > introduce the new interface, which seems to make sense too, and it
> > > > does still work in many cases, so I see the later patches as
> > > > enhancements of this one. Please let me know if you still want me to
> > > > have all of this squashed, or if you'd like me to squash some of
> > > > them.
> > >
> > > Yeah i have looked at those after looking at this one. You should just
> > > re-order the patches: this one first, then the one that adds the new flag,
> > > then the ones that add the new userfaultfd feature. Otherwise you are
> > > adding a userfaultfd feature that is broken midway, i.e. it is added
> > > broken and then you fix it. Someone bisecting things might get hurt
> > > by that. It is better to add and change everything you need and then
> > > add the new feature, so that the new feature will work as intended.
> > >
> > > So no squashing, just change the order, i.e. add the userfaultfd code
> > > last.
> >
> > Yes this makes sense, I'll do that in v2. Thanks for the suggestion!
>
> Note before doing a v2 i would really like to see some proof of why
> you need a new page table flag; see my reply to:
> userfaultfd: wp: add WP pagetable tracking to x86
>
> As i believe you can identify COW or KSM from UFD write protect
> without a pte flag.

Yes. I replied in that thread with my understanding on why the new
bit is required in the PTE (and also another new bit in the swap
entry). We can discuss there.
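
For reference, the dirty_accountable issue discussed earlier in this
thread can be modelled in userspace roughly as below. This is only a
sketch: the function name and the plain bool parameters are made up for
illustration and are not kernel API. It simply mirrors the condition
quoted from mprotect.c.

```c
#include <stdbool.h>

/*
 * Hypothetical userspace model of the check quoted from mprotect.c.
 * The parameters stand in for dirty_accountable, pte_dirty(ptent),
 * pte_soft_dirty(ptent) and (vma->vm_flags & VM_SOFTDIRTY).
 * Returns true when the quoted code would call pte_mkwrite().
 */
bool wp_resolve_sets_write(bool dirty_accountable, bool pte_is_dirty,
                           bool pte_is_soft_dirty, bool vma_has_softdirty)
{
    return dirty_accountable && pte_is_dirty &&
           (pte_is_soft_dirty || !vma_has_softdirty);
}
```

With dirty_accountable set (i.e. enable_wp is false in this patch), a
clean pte still returns false here, so it would stay read-only. That
is the case Jerome pointed out.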

Thanks,
--
Peter Xu