Re: [PATCH 2/2] mm,fork: introduce MADV_WIPEONFORK

From: Linus Torvalds
Date: Fri Aug 11 2017 - 15:42:46 EST


On Fri, Aug 11, 2017 at 12:19 PM, <riel@xxxxxxxxxx> wrote:
> diff --git a/mm/memory.c b/mm/memory.c
> index 0e517be91a89..f9b0ad7feb57 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -1134,6 +1134,16 @@ int copy_page_range(struct mm_struct *dst_mm, struct mm_struct *src_mm,
> 			!vma->anon_vma)
> 		return 0;
>
> +	/*
> +	 * With VM_WIPEONFORK, the child inherits the VMA from the
> +	 * parent, but not its contents.
> +	 *
> +	 * A child accessing VM_WIPEONFORK memory will see all zeroes;
> +	 * a child accessing VM_DONTCOPY memory receives a segfault.
> +	 */
> +	if (vma->vm_flags & VM_WIPEONFORK)
> +		return 0;
> +

Is this right?

Yes, you don't do the page table copies. Fine. But you leave the vma
with the anon_vma pointer - doesn't that mean that it's still connected
to the original anon_vma chain, and we might end up swapping something
in?

And even if that ends up not being an issue, I'd expect that you'd
want to break the anon_vma chain just to not make it grow
unnecessarily.

So my gut feel is that doing this in "copy_page_range()" is wrong, and
the logic should be moved up to dup_mmap(), where we can also
short-circuit the anon_vma chain entirely.

No?

The madvise() interface looks fine to me.

Linus
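
For reference, a minimal userspace sketch of the semantics the quoted
comment describes (the child sees zeroes, the parent keeps its data).
This is not part of the patch. The helper name child_sees_zeroes() is
hypothetical, and the fallback define assumes the value 18 that
MADV_WIPEONFORK has in the Linux uapi headers since 4.14. On a kernel
or libc without support, madvise() fails and the helper returns -1.

```c
#define _GNU_SOURCE
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#ifndef MADV_WIPEONFORK
#define MADV_WIPEONFORK 18	/* assumed: value from Linux uapi headers, 4.14+ */
#endif

/*
 * Hypothetical helper, for illustration only.
 * Returns 1 if the child saw zeroes and the parent kept its data,
 * 0 if the contents were not wiped as expected,
 * -1 if the test could not be run (e.g. madvise() not supported).
 */
int child_sees_zeroes(void)
{
	long page = sysconf(_SC_PAGESIZE);
	unsigned char *p;
	pid_t pid;
	int status;
	int result = -1;

	if (page <= 0)
		return -1;

	/* MADV_WIPEONFORK only applies to private anonymous mappings. */
	p = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
		 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED)
		return -1;

	memset(p, 0xAA, (size_t)page);

	if (madvise(p, (size_t)page, MADV_WIPEONFORK) != 0) {
		munmap(p, (size_t)page);
		return -1;
	}

	pid = fork();
	if (pid < 0) {
		munmap(p, (size_t)page);
		return -1;
	}
	if (pid == 0) {
		/* Child: the mapping exists, but should read back as zero. */
		_exit(p[0] == 0 ? 0 : 1);
	}

	if (waitpid(pid, &status, 0) == pid && WIFEXITED(status))
		result = (WEXITSTATUS(status) == 0) ? 1 : 0;

	/* Parent: its own copy must be untouched. */
	if (result == 1 && p[0] != 0xAA)
		result = 0;

	munmap(p, (size_t)page);
	return result;
}
```

Calling child_sees_zeroes() from a small main() and printing the
result is enough to check a given kernel. With MADV_DONTFORK in place
of MADV_WIPEONFORK the child would instead fault on the access, which
is the contrast the patch comment draws.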