Re: [PATCH v4 23/66] mm/mmap: Use advanced maple tree API for mmap_region()
From: Vlastimil Babka
Date: Mon Jan 17 2022 - 11:39:00 EST
On 12/1/21 15:30, Liam Howlett wrote:
> From: "Liam R. Howlett" <Liam.Howlett@xxxxxxxxxx>
>
> Changing mmap_region() to use the maple tree state and the advanced
> maple tree interface allows for a lot less tree walking.
>
> This change removes the last caller of munmap_vma_range(), so drop this
> unused function.
>
> Add vma_expand() to expand a VMA if possible by doing the necessary
> hugepage check, uprobe_munmap of files, dcache flush, modifications then
> undoing the detaches, etc.
>
> Signed-off-by: Liam R. Howlett <Liam.Howlett@xxxxxxxxxx>
> ---
> mm/mmap.c | 227 +++++++++++++++++++++++++++++++++++++++++-------------
> 1 file changed, 175 insertions(+), 52 deletions(-)
>
> diff --git a/mm/mmap.c b/mm/mmap.c
> index c06c5b850e1e..b0b7e327bf8b 100644
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -496,29 +496,6 @@ static inline struct vm_area_struct *__vma_next(struct mm_struct *mm,
> return vma->vm_next;
> }
>
> -/*
> - * munmap_vma_range() - munmap VMAs that overlap a range.
> - * @mm: The mm struct
> - * @start: The start of the range.
> - * @len: The length of the range.
> - * @pprev: pointer to the pointer that will be set to previous vm_area_struct
> - *
> - * Find all the vm_area_struct that overlap from @start to
> - * @end and munmap them. Set @pprev to the previous vm_area_struct.
> - *
> - * Returns: -ENOMEM on munmap failure or 0 on success.
> - */
> -static inline int
> -munmap_vma_range(struct mm_struct *mm, unsigned long start, unsigned long len,
> - struct vm_area_struct **pprev, struct list_head *uf)
> -{
> - // Needs optimization.
> - while (range_has_overlap(mm, start, start + len, pprev))
> - if (do_munmap(mm, start, len, uf))
> - return -ENOMEM;
> - return 0;
> -}
> -
> static unsigned long count_vma_pages_range(struct mm_struct *mm,
> unsigned long addr, unsigned long end)
> {
> @@ -619,6 +596,101 @@ static void __insert_vm_struct(struct mm_struct *mm, struct vm_area_struct *vma)
> mm->map_count++;
> }
>
> +/*
> + * vma_expand - Expand an existing VMA
> + * @mas: The maple state
> + * @vma: The vma to expand
> + * @start: The start of the vma
> + * @end: The exclusive end of the vma
> + */
Looks like this, similarly to the brk case, replaces one path calling
vma_merge->__vma_adjust() with something else. But this one is better
encapsulated and visible, so less likely to be forgotten in case something
changes. Would be even better if the brk case used it too :) seems like it
doesn't, at least as of this patch.
But it would be great to improve the documentation here - some params are
not documented, notably 'next', and I would explicitly state which scenarios
it does cover - i.e. vma_merge() lists 8 scenarios and I assume this can
handlea subset of those?
And scenarios not covered could be VM_WARN_ON'd?
Without such stated assumptions, it's hard/impossible to review both the
implementation against them and, and the callers against them.
> +inline int vma_expand(struct ma_state *mas, struct vm_area_struct *vma,
> + unsigned long start, unsigned long end, pgoff_t pgoff,
> + struct vm_area_struct *next)
> +{
> + struct mm_struct *mm = vma->vm_mm;
> + struct address_space *mapping = NULL;
> + struct rb_root_cached *root = NULL;
> + struct anon_vma *anon_vma = vma->anon_vma;
> + struct file *file = vma->vm_file;
> + bool remove_next = false;
> + int error;
> +
> + if (next && (vma != next) && (end == next->vm_end)) {
For example here this suggests that a case of 'end > next->vm_end' case is
not covered? How do I know whether it's intended or a bug?