[PATCH v2 0/4] mm: Break COW for pinned pages during fork()

From: Peter Xu
Date: Fri Sep 25 2020 - 18:26:02 EST


Due to the rebase onto the latest rc6, the main pte copy patch changed a lot,
so a detailed changelog is probably not that useful any more. However, all the
comments from the previous thread should be addressed as discussed there.
Please shout if I missed anything important.

This series is largely inspired by the previous discussion on the list [1],
which started with Jason's report of an rdma test failure. Linus proposed the
solution, which looks like a very nice way to avoid breaking userspace apps
that did not use MADV_DONTFORK properly before. More information can be found
in that thread.
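
For reference, the MADV_DONTFORK usage mentioned above can be sketched in
userspace as follows. This is only an illustration, not code from the series or
from the rdma tests: the helper names map_one_page() and child_has_mapping()
are made up here, and the buffer is a plain anonymous page standing in for a
DMA buffer. It shows that a range marked MADV_DONTFORK is simply not inherited
by the child, so fork() never has to COW-share it.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: map one private anonymous page standing in for a
 * DMA buffer. When dontfork is non-zero the page is marked MADV_DONTFORK,
 * so a child created by fork() does not inherit the mapping at all.
 * Returns NULL on failure.
 */
void *map_one_page(int dontfork)
{
	long page_size = sysconf(_SC_PAGESIZE);
	void *buf;

	assert(page_size > 0);
	buf = mmap(NULL, (size_t)page_size, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (buf == MAP_FAILED)
		return NULL;
	if (dontfork && madvise(buf, (size_t)page_size, MADV_DONTFORK) != 0) {
		munmap(buf, (size_t)page_size);
		return NULL;
	}
	return buf;
}

/*
 * Hypothetical helper: fork and report whether the child still has the
 * page mapped. Returns 1 if mapped, 0 if not, -1 on error. The child
 * probes with mincore(), which fails with ENOMEM on an unmapped range
 * instead of faulting.
 */
int child_has_mapping(void *buf)
{
	long page_size = sysconf(_SC_PAGESIZE);
	unsigned char vec;
	int status;
	pid_t pid = fork();

	if (pid < 0)
		return -1;
	if (pid == 0)
		_exit(mincore(buf, (size_t)page_size, &vec) == 0 ? 1 : 0);
	if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
		return -1;
	return WEXITSTATUS(status);
}
```

Calling child_has_mapping() on a page from map_one_page(0) and on one from
map_one_page(1) shows the difference: only the first is visible in the child.
Apps that skipped the madvise() call are the ones this series tries not to
break.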

I tested it myself by calling fork() after vfio had pinned a bunch of device
pages, and verified that the new pte copy logic works as expected, at least in
the most general path. I have not tested the thp case yet because, afaict, vfio
does not support thp-backed dma pages. Luckily, the pmd/pud thp patch is much
more straightforward than the pte one, so hopefully it can be verified directly
by code review plus some more heavyweight rdma tests.

Patch 1: Introduce mm.has_pinned
Patch 2: Preparation patch
Patch 3: Early cow solution for pte copy for pinned pages
Patch 4: Same as above, but for thp (pmd/pud).
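
As a rough mental model of how patches 1 and 3 fit together, the per-page
decision at fork() time can be sketched as a toy userspace function. This is
not the kernel code: the toy_* types and names are invented for illustration,
and the "maybe pinned" flag is an assumed stand-in for whatever per-page check
the patches actually use.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for kernel state; these are NOT the real structures. */
struct toy_mm {
	bool has_pinned;	/* set once the mm has ever pinned pages */
};

struct toy_page {
	bool maybe_dma_pinned;	/* assumed per-page "may be pinned" hint */
};

enum toy_fork_action {
	TOY_SHARE_PAGE,		/* usual path: child shares the page (COW) */
	TOY_COPY_EARLY,		/* break COW now: child gets its own copy */
};

/*
 * Decide what fork() does with one present page in this toy model.
 * The cheap mm-wide flag is checked before the per-page hint, so a
 * process that never pinned anything stays on the usual path.
 */
enum toy_fork_action toy_decide_fork_action(const struct toy_mm *src_mm,
					    const struct toy_page *page,
					    bool is_cow_mapping)
{
	assert(src_mm != NULL && page != NULL);

	if (!is_cow_mapping)
		return TOY_SHARE_PAGE;	/* shared mappings never COW */
	if (!src_mm->has_pinned)
		return TOY_SHARE_PAGE;	/* mm never pinned any page */
	if (!page->maybe_dma_pinned)
		return TOY_SHARE_PAGE;	/* this page is not pinned */
	return TOY_COPY_EARLY;		/* parent keeps the pinned page */
}
```

The point of the ordering in this sketch is that the mm-wide flag acts as a
fast filter; only processes that have pinned pages pay for the per-page check.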

The hugetlbfs fix is still missing but, as planned, it is not urgent, so we can
work on it on top of this series. Comments are very welcome.

[1] https://lore.kernel.org/lkml/20200914143829.GA1424636@xxxxxxxxxx/

Thanks.

Peter Xu (4):
  mm: Introduce mm_struct.has_pinned
  mm/fork: Pass new vma pointer into copy_page_range()
  mm: Do early cow for pinned pages during fork() for ptes
  mm/thp: Split huge pmds/puds if they're pinned when fork()

 include/linux/mm.h       |   2 +-
 include/linux/mm_types.h |  10 +++
 kernel/fork.c            |   3 +-
 mm/gup.c                 |   6 ++
 mm/huge_memory.c         |  28 ++++++
 mm/memory.c              | 186 ++++++++++++++++++++++++++++++++++-----
 6 files changed, 212 insertions(+), 23 deletions(-)

--
2.26.2