Re: [PATCH v2] mm/mmap: return early if it can't merge in vma_merge()

From: Yajun Deng
Date: Thu Feb 22 2024 - 02:47:28 EST



On 2024/2/22 04:41, Lorenzo Stoakes wrote:
On Wed, Feb 21, 2024 at 10:38:27AM -0500, Liam R. Howlett wrote:
* Yajun Deng <yajun.deng@xxxxxxxxx> [240221 04:15]:
In most cases, the range of the area is valid. But in do_mprotect_pkey(),
the minimum value of end and vma->vm_end is passed to mprotect_fixup().
This will lead to the end is less than the end of prev.

In this case, the curr will be NULL, but the next will be equal to the
prev. So it will attempt to merge before, the vm_pgoff check will cause
this case to fail.

To avoid the process described above and reduce unnecessary operations.
Add a check to immediately return NULL if the end is less than the end of
prev.
If it's only one caller, could we stop that caller instead of checking
an almost never case for all callers? Would this better fit in
vma_modify()? Although that's not just for this caller at this point.
Maybe there isn't a good place?
I definitely agree with Liam that this should not be in vma_merge(), as
it's not going to be relevant to _most_ callers.

I am not sure vma_modify() is much better, this would be the only early
exit check in that function and makes what is very simple and
straightforward now more confusing.


There are two paths that will cause this case. One is in mprotect_fixup(), the other is in

madvise_update_vma().


One way is to separate out the split_vma() from vma_modify(). And create a new helper function.

We can call it directly at source, but we need to do this in both paths.  It's more complicated.


The other way is to add a check in vma_modify(). Like the following:

diff --git a/mm/mmap.c b/mm/mmap.c
index 0fccd23f056e..f93f1d3939f2 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -2431,11 +2431,15 @@ struct vm_area_struct *vma_modify(struct vma_iterator *vmi,
        pgoff_t pgoff = vma->vm_pgoff + ((start - vma->vm_start) >> PAGE_SHIFT);
        struct vm_area_struct *merged;

+       if (prev && end < prev->vm_end)
+               goto cannot_merge;
+
        merged = vma_merge(vmi, prev, vma, start, end, vm_flags,
                           pgoff, policy, uffd_ctx, anon_name);
        if (merged)
                return merged;

+cannot_merge:
        if (vma->vm_start < start) {
                int err = split_vma(vmi, vma, start, 1);


And I think this is the crux of it - it's confusing that we special case
this one particular non-merge scenario, but no others (all of which we then
deem ok to be caught by the usual rules).

I think it's simpler (and more efficient) to just keep things the way they
are.

Or are there other reasons this may happen and is better done in this
function?

Often, this is called the "punch a hole" scenario; where an operation
creates two entries from the old data and either leaves an empty space
or fills the space with a new VMA.

Signed-off-by: Yajun Deng <yajun.deng@xxxxxxxxx>
---
v2: remove the case label.
v1: https://lore.kernel.org/all/20240218085028.3294332-1-yajun.deng@xxxxxxxxx/
---
mm/mmap.c | 3 +++
1 file changed, 3 insertions(+)

diff --git a/mm/mmap.c b/mm/mmap.c
index 0fccd23f056e..7668854d2246 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -890,6 +890,9 @@ static struct vm_area_struct
if (vm_flags & VM_SPECIAL)
return NULL;

+ if (prev && end < prev->vm_end)
+ return NULL;
+
/* Does the input range span an existing VMA? (cases 5 - 8) */
curr = find_vma_intersection(mm, prev ? prev->vm_end : 0, end);

--
2.25.1

So overall I don't think this check makes much sense anywhere.

I think a better solution would be to prevent it happening _at source_ if
you can find a logical way of doing so.

I do plan to do some cleanup passes over this stuff once again so maybe I
can figure something out that better handles non-merge scenarios.

This is a great find though overall even if a patch doesn't make sense
Yajun, thanks for this, it's really made me think about this case (+ others
like it) :)