RE: [PATCH 0/15] mm: introduce ANON_VMA_LAZY for deferred anon_vma creation

From: wangtao

Date: Tue Jun 02 2026 - 23:00:44 EST


> On 5/27/26 8:01 PM, tao wrote:
> > Design overview
> > ---------------
> >
> > ANON_VMA_LAZY defers anon_vma allocation until it is actually needed
> > (for example during fork). VMAs that never participate in sharing can
> > avoid creating anon_vma structures entirely.
> >
> > Before an anon_vma exists, rmap operations rely directly on VMA
> > information, so no anon_vma locking is required. An anon_vma is
> > created and linked only when sharing semantics are required.
>
> It is unfortunate that the design overview doesn't cover correctness aspect
> at all. VMAs are subject to change (even before being shared with other
> processes), and rmap needs something that doesn't go away across VMA
> merging, split, etc.
>
> I'm not sure how the idea is supposed work correctly.
>
> --
> Cheers,
> Harry / Hyeonggon

VMA operations can be roughly divided into three categories. How
ANON_VMA_LAZY is handled in each of them is briefly described below.

1. fork

fork duplicates the parent's mm/mmap. (exec creates a new mm/mmap and is
not involved here.) This can be viewed as copying the VMAs with identical
virtual addresses into a new address space.

If the parent VMA (pvma) is ANON_VMA_LAZY, it is first upgraded to use a
regular anon_vma. The folio->mapping of each affected folio is then
fixed up in try_dup_anon_rmap().

2. mmap / brk / mprotect / munmap

These operations create, modify, or remove VMAs in the current mm. They
may split existing VMAs, merge adjacent VMAs, or remove a VMA from mm_mt.

When a new VMA is created, vm_start, vm_end and vm_pgoff are initialized
and the VMA is inserted into mm_mt. Although these fields may later be
modified, the following value remains invariant:

(vm_start - vm_pgoff * PAGE_SIZE)

We refer to this value as:

vma_mapping_base(vma) = vma->vm_start - vma->vm_pgoff * PAGE_SIZE

This value also remains unchanged when the VMA is removed from mm_mt.

If a VMA is split and produces new_vma, the following holds:

vma_mapping_base(new_vma) == vma_mapping_base(vma)

If two adjacent VMAs vma_a and vma_b are merged into vma_x, then:

vma_mapping_base(vma_a) == vma_mapping_base(vma_b) ==
vma_mapping_base(vma_x)
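
To make the split/merge invariant concrete, here is a minimal userspace
sketch. It is not code from the series: struct vma is a simplified
stand-in for vm_area_struct, and split_vma_at(), merge_vmas() and
demo_split_merge() are hypothetical helpers that only model how vm_start
and vm_pgoff move together.

```c
#include <assert.h>

#define PAGE_SHIFT 12
#define PAGE_SIZE  (1UL << PAGE_SHIFT)

/* Simplified stand-in for struct vm_area_struct (illustration only). */
struct vma {
	unsigned long vm_start;	/* first byte of the range */
	unsigned long vm_end;	/* first byte after the range */
	unsigned long vm_pgoff;	/* page offset that corresponds to vm_start */
};

static unsigned long vma_mapping_base(const struct vma *vma)
{
	return vma->vm_start - vma->vm_pgoff * PAGE_SIZE;
}

/* Hypothetical split: vma keeps [vm_start, addr), the result gets [addr, vm_end). */
static struct vma split_vma_at(struct vma *vma, unsigned long addr)
{
	struct vma new_vma = {
		.vm_start = addr,
		.vm_end   = vma->vm_end,
		.vm_pgoff = vma->vm_pgoff + ((addr - vma->vm_start) >> PAGE_SHIFT),
	};

	vma->vm_end = addr;
	return new_vma;
}

/* Hypothetical merge: only valid for adjacent VMAs with contiguous page offsets. */
static struct vma merge_vmas(const struct vma *a, const struct vma *b)
{
	struct vma x = {
		.vm_start = a->vm_start,
		.vm_end   = b->vm_end,
		.vm_pgoff = a->vm_pgoff,
	};

	assert(a->vm_end == b->vm_start);
	assert(a->vm_pgoff + ((a->vm_end - a->vm_start) >> PAGE_SHIFT) == b->vm_pgoff);
	return x;
}

/* Returns 0 if the mapping base survives a split followed by a merge. */
static int demo_split_merge(void)
{
	struct vma vma = { .vm_start = 0x100000, .vm_end = 0x108000, .vm_pgoff = 0x40 };
	unsigned long base = vma_mapping_base(&vma);
	struct vma new_vma = split_vma_at(&vma, 0x103000);
	struct vma vma_x;

	if (vma_mapping_base(&vma) != base || vma_mapping_base(&new_vma) != base)
		return -1;

	vma_x = merge_vmas(&vma, &new_vma);
	return vma_mapping_base(&vma_x) == base ? 0 : -1;
}
```

The point of the sketch is that a split advances vm_start and vm_pgoff by
the same number of pages, so their difference cannot change.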

Call the VMA in which the first page fault occurs root_vma, and ensure
that any VMA produced by a later split or merge holds a reference to
root_vma.

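During rmap, the folio address can then be computed from root_vma alone:

folio_addr = vma_address(vma, pgoff, 1)
           = vma->vm_start + ((pgoff - vma->vm_pgoff) << PAGE_SHIFT)
           = vma_mapping_base(vma) + pgoff * PAGE_SIZE
           = vma_mapping_base(root_vma) + folio_pgoff * PAGE_SIZE

Here pgoff is the folio's page offset (folio_pgoff), and the last step
holds because vma and root_vma have the same mapping base. We can then
use folio_addr to locate the VMA that currently covers this folio.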
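
As a rough illustration of that lookup, the sketch below computes
folio_addr from the base recorded for root_vma and then searches for the
covering VMA. All names are assumptions made for the example:
lazy_folio_addr(), find_vma_covering() and demo_lookup() are hypothetical,
and the linear array search merely stands in for whatever lookup the real
code performs.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SHIFT 12
#define PAGE_SIZE  (1UL << PAGE_SHIFT)

/* Simplified stand-in for struct vm_area_struct (illustration only). */
struct vma {
	unsigned long vm_start;
	unsigned long vm_end;
	unsigned long vm_pgoff;
};

static unsigned long vma_mapping_base(const struct vma *vma)
{
	return vma->vm_start - vma->vm_pgoff * PAGE_SIZE;
}

/* Hypothetical helper: folio address from a mapping base and a page offset. */
static unsigned long lazy_folio_addr(unsigned long mapping_base,
				     unsigned long folio_pgoff)
{
	return mapping_base + folio_pgoff * PAGE_SIZE;
}

/* Linear search standing in for a real VMA lookup by address. */
static const struct vma *find_vma_covering(const struct vma *vmas, size_t n,
					   unsigned long addr)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (addr >= vmas[i].vm_start && addr < vmas[i].vm_end)
			return &vmas[i];
	return NULL;
}

/* Returns 0 if the folio is found in the second half of a split root_vma. */
static int demo_lookup(void)
{
	/* State at first fault: one VMA, which becomes root_vma. */
	struct vma root_vma = { .vm_start = 0x100000, .vm_end = 0x108000, .vm_pgoff = 0x40 };
	unsigned long base = vma_mapping_base(&root_vma);

	/* Later state: the range has been split at 0x103000. */
	struct vma vmas[2] = {
		{ .vm_start = 0x100000, .vm_end = 0x103000, .vm_pgoff = 0x40 },
		{ .vm_start = 0x103000, .vm_end = 0x108000, .vm_pgoff = 0x43 },
	};
	unsigned long folio_pgoff = 0x45;
	unsigned long addr = lazy_folio_addr(base, folio_pgoff);
	const struct vma *hit = find_vma_covering(vmas, 2, addr);

	if (hit != &vmas[1])
		return -1;

	/* Same result as the usual per-VMA computation. */
	if (addr != hit->vm_start + ((folio_pgoff - hit->vm_pgoff) << PAGE_SHIFT))
		return -1;
	return 0;
}
```

Note that the lookup needs only the base value and the folio's page
offset, not the current vm_start/vm_pgoff of any particular VMA.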

3. mremap / uffd_move

If only the size changes and the start address remains the same, there
is no impact.

If the start address changes, the page is moved from (vma, old_addr) to
(new_vma, new_addr). In this case:

vma_mapping_base(new_vma) =
vma_mapping_base(vma) + new_addr - old_addr

Because the mapping base changes, we first upgrade the VMA to a regular
anon_vma, and then fix folio->mapping in move_ptes().
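
The base shift in the move case can be checked with a small userspace
sketch. It assumes the destination VMA gets a vm_pgoff that continues the
source VMA's page offsets at old_addr; move_vma_to() and demo_move() are
hypothetical names for this example and not the helpers used in the
series.

```c
#include <assert.h>

#define PAGE_SHIFT 12
#define PAGE_SIZE  (1UL << PAGE_SHIFT)

/* Simplified stand-in for struct vm_area_struct (illustration only). */
struct vma {
	unsigned long vm_start;
	unsigned long vm_end;
	unsigned long vm_pgoff;
};

static unsigned long vma_mapping_base(const struct vma *vma)
{
	return vma->vm_start - vma->vm_pgoff * PAGE_SIZE;
}

/*
 * Hypothetical move of [old_addr, old_addr + len) to new_addr.
 * Assumption: the new VMA keeps the page offset that old_addr had in vma.
 */
static struct vma move_vma_to(const struct vma *vma, unsigned long old_addr,
			      unsigned long new_addr, unsigned long len)
{
	struct vma new_vma = {
		.vm_start = new_addr,
		.vm_end   = new_addr + len,
		.vm_pgoff = vma->vm_pgoff + ((old_addr - vma->vm_start) >> PAGE_SHIFT),
	};

	return new_vma;
}

/* Returns 0 if the base shifts by exactly new_addr - old_addr. */
static int demo_move(void)
{
	struct vma vma = { .vm_start = 0x100000, .vm_end = 0x108000, .vm_pgoff = 0x40 };
	unsigned long old_addr = 0x102000, new_addr = 0x200000, len = 0x4000;
	struct vma new_vma = move_vma_to(&vma, old_addr, new_addr, len);
	unsigned long folio_pgoff = 0x43;	/* page at old_addr + PAGE_SIZE */

	if (vma_mapping_base(&new_vma) !=
	    vma_mapping_base(&vma) + new_addr - old_addr)
		return -1;

	/* The old base still yields the old address for this folio... */
	if (vma_mapping_base(&vma) + folio_pgoff * PAGE_SIZE != old_addr + PAGE_SIZE)
		return -1;

	/* ...while the new base yields the address the page was moved to. */
	if (vma_mapping_base(&new_vma) + folio_pgoff * PAGE_SIZE != new_addr + PAGE_SIZE)
		return -1;
	return 0;
}
```

In this model a folio that still resolved through the old base would be
looked up at its old address, which is why the mapping has to be fixed
when the page is moved.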

If performance becomes a concern, ANON_VMA_LAZY can be enabled only for
relatively small VMAs.

