Re: [PATCH 4/5] mm: Do early cow for pinned pages during fork() for ptes

From: Jann Horn
Date: Mon Sep 21 2020 - 17:55:37 EST


On Mon, Sep 21, 2020 at 11:20 PM Peter Xu <peterx@xxxxxxxxxx> wrote:
> This patch is greatly inspired by the discussions on the list from Linus, Jason
> Gunthorpe and others [1].
>
> It allows copy_pte_range() to do early cow if the pages were pinned on the
> source mm. Currently we don't have an accurate way to know whether a page is
> pinned or not. The only thing we have is page_maybe_dma_pinned(). However
> that's good enough for now. Especially, with the newly added mm->has_pinned
> flag to make sure we won't affect processes that never pinned any pages.

To clarify: This patch only handles pin_user_pages() callers and
doesn't try to address other GUP users, right? E.g. if task A uses
process_vm_writev() on task B while task B is going through fork(),
that can still race in such a way that the written data only shows up
in the child and not in B, right?
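
To make the ordering I have in mind concrete, here is a toy userspace
model in C. It is only a sketch, not kernel code: struct page, struct
task, model_fork(), model_write() and simulate_race() are all made-up
names, and it assumes that the first writer to a COW-shared page gets
the copy while the other mapping keeps the original page.

```c
/* Toy userspace model of the race; NOT kernel code. All names are hypothetical. */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

#define PAGE_SZ 16

struct page { char data[PAGE_SZ]; int mapcount; };
struct task { struct page *pte; int writable; };
struct result { char parent[PAGE_SZ]; char child[PAGE_SZ]; };

static struct page *page_alloc(const char *init)
{
	struct page *p = calloc(1, sizeof(*p));

	assert(p != NULL);
	snprintf(p->data, PAGE_SZ, "%s", init);
	p->mapcount = 1;
	return p;
}

/* fork(): the child shares the page and both mappings become read-only (COW). */
static void model_fork(struct task *parent, struct task *child)
{
	child->pte = parent->pte;
	parent->pte->mapcount++;
	parent->writable = 0;
	child->writable = 0;
}

/* Write fault: a still-shared page is copied and the writer gets the copy. */
static void model_write(struct task *t, const char *s)
{
	if (!t->writable && t->pte->mapcount > 1) {
		struct page *copy = page_alloc(t->pte->data);

		t->pte->mapcount--;
		t->pte = copy;
	}
	t->writable = 1;
	snprintf(t->pte->data, PAGE_SZ, "%s", s);
}

static struct result simulate_race(void)
{
	struct task b = { page_alloc("old"), 1 };
	struct task child = { NULL, 0 };
	struct result r;

	/* A: short-term GUP on B's page (the model skips the page refcount). */
	struct page *gup = b.pte;

	model_fork(&b, &child);		/* B: fork() */
	model_write(&b, "B-data");	/* B: another thread writes, breaking COW */

	/* A: the copy lands in the page it looked up before the fork. */
	snprintf(gup->data, PAGE_SZ, "%s", "A-data");

	snprintf(r.parent, PAGE_SZ, "%s", b.pte->data);
	snprintf(r.child, PAGE_SZ, "%s", child.pte->data);
	free(b.pte);
	free(child.pte);
	return r;
}
```

In this model, simulate_race() returns a result where A's data is only
in the child's page and B's page holds what B itself wrote, because B
was moved to the copy before A's write landed in the original page.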

I dislike the whole pin_user_pages() concept because (as far as I
understand) it fundamentally tries to fix a problem in the subset of
cases that are more likely to occur in practice (long-term pins
overlapping with things like writeback), and ignores the rarer cases
("short-term" GUP).