Re: [RFC PATCH RESEND 07/28] kernel/fork: mark VMAs as locked before copying pages during fork

From: Suren Baghdasaryan
Date: Fri Sep 09 2022 - 12:30:04 EST


On Fri, Sep 9, 2022 at 6:27 AM Laurent Dufour <ldufour@xxxxxxxxxxxxx> wrote:
>
> > On 09/09/2022 at 01:57, Suren Baghdasaryan wrote:
> > On Tue, Sep 6, 2022 at 7:38 AM Laurent Dufour <ldufour@xxxxxxxxxxxxx> wrote:
> >>
> >> On 01/09/2022 at 19:34, Suren Baghdasaryan wrote:
> >>> Protect VMAs from the concurrent page fault handler while performing
> >>> copy_page_range for VMAs not having the VM_WIPEONFORK flag set.
> >>
> >> I'm wondering why that is necessary.
> >> The copied mm is write-locked, and the destination one is not yet reachable.
> >> Any other readers using the VMA would be doing so only for page fault handling.
> >
> > Correct, this is done to prevent page faulting in the VMA being
> > duplicated. I assume we want to prevent the pages in that VMA from
> > changing when we are calling copy_page_range(). Am I wrong?
>
> If a page is faulted in while copy_page_range() is in progress, the page may
> not be backed on the child side (the PTE lock should protect the copy, shouldn't it?).
> Is that a real problem? It will be backed later if accessed on the child side.
> Maybe the per-process page accounting could become incorrect...

This feels to me like walking on the edge. Maybe we can discuss this
with more people at LPC before trying it?

>
> >
> >> I must have missed something, because I can't see any need to mark the
> >> VMA locked here.
> >>
> >>> Signed-off-by: Suren Baghdasaryan <surenb@xxxxxxxxxx>
> >>> ---
> >>> kernel/fork.c | 4 +++-
> >>> 1 file changed, 3 insertions(+), 1 deletion(-)
> >>>
> >>> diff --git a/kernel/fork.c b/kernel/fork.c
> >>> index bfab31ecd11e..1872ad549fed 100644
> >>> --- a/kernel/fork.c
> >>> +++ b/kernel/fork.c
> >>> @@ -709,8 +709,10 @@ static __latent_entropy int dup_mmap(struct mm_struct *mm,
> >>> rb_parent = &tmp->vm_rb;
> >>>
> >>> mm->map_count++;
> >>> - if (!(tmp->vm_flags & VM_WIPEONFORK))
> >>> + if (!(tmp->vm_flags & VM_WIPEONFORK)) {
> >>> + vma_mark_locked(mpnt);
> >>> retval = copy_page_range(tmp, mpnt);
> >>> + }
> >>>
> >>> if (tmp->vm_ops && tmp->vm_ops->open)
> >>> tmp->vm_ops->open(tmp);
> >>
>