Re: [PATCH v2] mm/gup: Allow real explicit breaking of COW

From: Peter Xu
Date: Tue Aug 11 2020 - 11:28:06 EST


On Tue, Aug 11, 2020 at 08:05:22AM -0700, Linus Torvalds wrote:
> On Mon, Aug 10, 2020 at 4:39 PM Jann Horn <jannh@xxxxxxxxxx> wrote:
> >
> > The way I understand Peter, he doesn't want to avoid doing COW; he
> > wants to decouple userfaultfd-WP's fault handling from COW, so that
> > userfaultfd-wp notifies only when a previously-write-protected page is
> > actually written to. In other words, he wants the COW to basically
> > happen as it happens now, but it should only create a readonly PTE;
> > and if someone later triggers a real write fault, the fault handling
> > path would run again, and this time userfaultfd-wp would be notified
> > before that readonly PTE is turned into a writable one.

Yes, that's what I meant. Appreciate your help, Jann.

>
> Ahh.
>
> A light goes on.
>
> Thank you.
>
> And apologies to Peter - I misread that patch entirely.
>
> That said, now that I (finally) understand what Peter wants to do, I
> don't think the patch does what you say.
>
> Because the GUP will now indeed avoid userfaultfd-wp unless it's
> _actually_ a write, but then any reads will cause a COW that turns
> things writable. There is no second fault.
>
> So now later writes will never cause any userfaultfd-wp notifications at all.
>
> Which for all I know might be acceptable and ok, but it seems to be
> against userfaultfd rules, and against the whole synchronization idea.
>
> So I think the patch is broken, but I'm less fundamentally concerned about it.
>
> Because at that point, it's "only" userfaultfd that might break.

Right, v2 is broken in that respect. That's why I pasted another chunk in my
previous reply [1], so that the UFFD_WP bit is still inherited across COW.
Previously that was not needed, because UFFD_WP must already have been cleared
for the pte/pmd/.. before COW happened. With enforced COW, however, that is no
longer guaranteed.
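
To make the difference concrete, here is a toy userspace model of the two
behaviours. It is only a sketch, not kernel code: struct toy_pte, toy_cow(),
toy_write() and toy_demo() are all made-up names for illustration, and the
model assumes that a COW without inheritance leaves a writable mapping with
the marker lost, as described above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a page table entry; not the kernel's pte_t. */
struct toy_pte {
	int  page;      /* which backing page is mapped */
	bool writable;  /* write permission on the mapping */
	bool uffd_wp;   /* userfaultfd write-protect marker */
};

/*
 * Model an enforced COW triggered by a read-only GUP.
 * inherit_wp == false: the copy ends up writable and the marker is lost.
 * inherit_wp == true:  the copy stays read-only and keeps the marker.
 */
static struct toy_pte toy_cow(struct toy_pte old, int new_page, bool inherit_wp)
{
	struct toy_pte copy;

	assert(new_page != old.page);	/* COW must land on a fresh page */

	copy.page = new_page;
	copy.writable = !inherit_wp;
	copy.uffd_wp = inherit_wp && old.uffd_wp;
	return copy;
}

/*
 * Model a later write to the mapping.
 * Returns true if the userfaultfd-wp monitor would be notified.
 */
static bool toy_write(struct toy_pte *pte)
{
	if (pte->writable)
		return false;		/* no fault at all, so no notification */

	if (pte->uffd_wp) {
		/* Monitor is notified; once it resolves, protection is dropped. */
		pte->uffd_wp = false;
		pte->writable = true;
		return true;
	}

	pte->writable = true;		/* ordinary write fault, nothing to report */
	return false;
}

/* Write-protected page, then enforced COW, then a real write. */
static bool toy_demo(bool inherit_wp)
{
	struct toy_pte pte = { .page = 1, .writable = false, .uffd_wp = true };

	pte = toy_cow(pte, 2, inherit_wp);
	return toy_write(&pte);
}
```

In this model toy_demo(false) reports no notification for the later write,
while toy_demo(true) reports one, which is the behaviour that inheriting the
bit across COW is meant to preserve.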

I'll post v3 soon.

[1] https://lore.kernel.org/lkml/20200810191520.GA132381@xz-x1/

Thanks,

--
Peter Xu