* Yajun Deng <yajun.deng@xxxxxxxxx> [240220 22:26]:
On 2024/2/21 02:06, Liam R. Howlett wrote:Ah, so the code doesn't match the comment, since the code will still use
* Yajun Deng <yajun.deng@xxxxxxxxx> [240218 21:30]:Because the mas_reset() and mas_*_range() only covert mas to vmi. It's
Cc: Vlastimil, Lorenzo,SurenThanks for looking at this.
On 2024/2/18 10:31, Yajun Deng wrote:
There are two types of iterators mas and vmi in the current code. If the
maple tree comes from the mm struct, we can use vma iterator. Avoid using
mas directly.
I had left the maple state exposed in the mmap.c file because it does a
number of operations that no one else does, so the new functions will be
called a very limited number of times (as low as once).
I think this is a worth while change since this may be needed in the
future for dealing with more specialised uses of the tree. It also
removes the 0/ULONG_MAX limits from certain calls, and the vma iterator
names help explain things.
I don't know why you treat the low/high search differently than the
mas_reset() and mas_*_range(). In any case, the comment is inaccurate
when mas_ functions are called with &vmi.mas in places.
simple.
But the low/high also covert max to max - 1. It's a little more complex.
mas directly in this version. This was, perhaps, the largest issue with
the patch. Having a good patch log is very important as people rely on
it during reviews, but more importantly when tracking down an issue
later on.
I like the idea of removing as many mas uses as feasible, but we will
still have a few that must be passed through, so please change the
wording.
...Well, it's still VMAs from the mm struct. I agree that we should notLeave the mt_detach tree keep using mas, as it doesn't come from the mm
struct.
change this for now.
Do you mean mas functions? You do pass the maple state through to otherConvert all mas except mas_detach to vma iterator. And introduce
vma_iter_area_{lowest, highest} helper functions for use vma interator.
areas. ie: free_pgtables().
Yes.
Thanks. This works and the variable isn't used again.Yes, the following changes 'if (vm_start_gap(tmp) <= gap_end)' to 'ifvma_iter_end will return vmi->mas.last + 1, is what you have hereretry:
- if (mas_empty_area_rev(&mas, low_limit, high_limit - 1, length))
+ if (vma_iter_area_highest(&vmi, low_limit, high_limit, length))
return -ENOMEM;
- gap = mas.last + 1 - info->length;
+ gap = vma_iter_end(&vmi) - info->length;
gap -= (gap - info->align_offset) & info->align_mask;
- gap_end = mas.last;
- tmp = mas_next(&mas, ULONG_MAX);
+ gap_end = vma_iter_end(&vmi);
correct?
(vm_start_gap(tmp) < gap_end)'.
+ tmp = vma_next(&vmi);
if (tmp && (tmp->vm_flags & VM_STARTGAP_FLAGS)) { /* Avoid prev check if possible */
- if (vm_start_gap(tmp) <= gap_end) {
+ if (vm_start_gap(tmp) < gap_end) {
...
This is confusing. I think you are doing this so that the vma iterator@@ -1959,11 +1958,12 @@ static int expand_upwards(struct vm_area_struct *vma, unsigned long address)
struct vm_area_struct *next;
unsigned long gap_addr;
int error = 0;
- MA_STATE(mas, &mm->mm_mt, vma->vm_start, address);
+ VMA_ITERATOR(vmi, mm, 0);
if (!(vma->vm_flags & VM_GROWSUP))
return -EFAULT;
+ vma_iter_config(&vmi, vma->vm_start, address);
is set up the same as the maple state, and not what is logically
necessary?
Yes, VMA_ITERATOR can only pass one address.
This isn't really hiding the maple state./* Guard against exceeding limits of the address space. */
address &= PAGE_MASK;
if (address >= (TASK_SIZE & PAGE_MASK))
@@ -1985,15 +1985,15 @@ static int expand_upwards(struct vm_area_struct *vma, unsigned long address)
}
if (next)
- mas_prev_range(&mas, address);
+ mas_prev_range(&vmi.mas, address);
Okay, I will create a new helper function for this in the mm/internal.h.
The above maple state changes is to get the maple state to point to the- __mas_set_range(&mas, vma->vm_start, address - 1);
- if (mas_preallocate(&mas, vma, GFP_KERNEL))
+ vma_iter_config(&vmi, vma->vm_start, address);
correct area for the preallocation call below. This seems unnecessary
to me.
We really should just set it up correctly. Unfortunately, with the VMA
iterator, that's not really possible on initialization.
What we can do is use the vma->vm_start for the initialization, then use
vma_iter_config() here. That will not reset any state - but that's fine
because the preallocation is the first call that actually uses it
anyways.
So we can initialize with vma->vm_start, don't call vma_iter_config
until here, and also drop the if (next) part.
This is possible here because it's not optimised like the
expand_upwards() case, which uses the state to check prev and avoids an
extra walk.
Please make sure to test with the ltp tests on the stack combining, etc
on a platform that expands down.
Testing this can be tricky. Thanks for looking at it.
Okay, I will test it.
...
That's okay, we can leave that for another time. I believe PengI guess the page tables still deal with the maple state directly then.mmap_write_lock(mm);
mt_clear_in_rcu(&mm->mm_mt);
- mas_set(&mas, vma->vm_end);
- free_pgtables(&tlb, &mas, vma, FIRST_USER_ADDRESS,
+ vma_iter_set(&vmi, vma->vm_end);
+ free_pgtables(&tlb, &vmi.mas, vma, FIRST_USER_ADDRESS,
USER_PGTABLES_CEILING, true);
Yes.
complicated the internals of free_pgtables() with his forking
optimisation so care will need to be taken and should probably be done
in another patch, another time. In fact, hiding xa_is_zero() within the
vma iterator is going to be a really good thing to do, if performance
doesn't suffer.
Just don't state we are removing the use of maple state in the change
log - since we do pass it through sometimes.
...
Thanks again,
Liam