Re: [RFC PATCH 0/4] Removing limitations of merging anonymous VMAs
From: Liam Howlett
Date: Fri Feb 25 2022 - 09:25:47 EST
* Vlastimil Babka <vbabka@xxxxxxx> [220225 05:31]:
> On 2/18/22 20:21, Liam Howlett wrote:
> > * Jakub Matěna <matenajakub@xxxxxxxxx> [220218 07:21]:
> >> Motivation
> >> In the current kernel it is impossible to merge two anonymous VMAs
> >> if one of them was moved. That is because a VMA's page offset is
> >> set according to the virtual address where it was created, and in
> >> order to merge, the page offsets of the two VMAs need to be contiguous.
> >> Another problem when merging two VMAs is their anon_vma. In the
> >> current kernel these anon_vmas have to be one and the same.
> >> Otherwise the merge is again not allowed.
> >> Missed merge opportunities increase the number of VMAs of a process
> >> and in some cases can cause problems when a max count is reached.
> >
> > Does this really happen that much? Is it worth trying even harder to
>
> Let me perhaps clarify. Maybe not in general, but some mremap() heavy
> workloads fragment VMA space a lot, have to increase the vma limits etc.
> While the original motivation was a proprietary workload, there are e.g.
> allocators such as jemalloc that rely on mremap().
>
> But yes, it might turn out that the benefit is not universal and we might
> consider some ways to make more aggressive merging opt-in.
>
> > merge VMAs? I am not really sure the VMA merging today is worth it - we
> > are under a lock known to be a bottleneck while examining if it's
>
> I'd be afraid that scaling back the existing merging would break some
> userspace expectations when inspecting e.g. /proc/pid/maps
Is that a risk considering how many things stop the merging of VMAs? We
just added another (names). Not all the information can be in
/proc/pid/maps - otherwise the tracing patch wouldn't really be
necessary?
>
> > possible to merge. Hard data about how often and the cost of merging
> > would be a good argument to try harder or give up earlier.
> >
> >>
> >> Solution
> >> The following patch series solves the first problem with
> >> page offsets by updating them when the VMA is moved to a
> >> different virtual address (patch 2). As for the second
> >> problem, merging of VMAs with different anon_vmas is allowed
> >> (patch 3). Patch 1 refactors the function vma_merge(), which
> >> makes it easier to understand and also allows relatively
> >> seamless tracing of successful merges, introduced by patch 4.
> >>
> >> Limitations
> >> For both problems the solution works only for VMAs that do not share
> >> physical pages with other processes (usually child or parent
> >> processes). This is checked by looking at the anon_vma of the respective
> >> VMA. The reason why the shared case is not possible, or at least not
> >> easy, to handle is that each physical page has a pointer to an anon_vma
> >> and a page offset. When this physical page is shared we cannot simply
> >> change these parameters without affecting all of the VMAs mapping
> >> this physical page. The good thing is that this case accounts for only
> >> about 1-3% of all merges (measured for internet browsing and
> >> compilation use cases) that fail to merge in the current kernel.
> >
> > It sounds like you have data for some use cases on the mergers already.
> > Do you have any results on this change?
> >
> >>
> >> This series of patches and documentation of the related code will
> >> be part of my master's thesis.
> >> This patch series is based on tag v5.17-rc4.
> >>
> >> Jakub Matěna (4):
> >> mm: refactor of vma_merge()
> >> mm: adjust page offset in mremap
> >> mm: enable merging of VMAs with different anon_vmas
> >> mm: add tracing for VMA merges
> >>
> >> include/linux/rmap.h | 17 ++-
> >> include/trace/events/mmap.h | 55 +++++++++
> >> mm/internal.h | 11 ++
> >> mm/mmap.c | 232 ++++++++++++++++++++++++++----------
> >> mm/rmap.c | 40 +++++++
> >> 5 files changed, 290 insertions(+), 65 deletions(-)
> >>
> >> --
> >> 2.34.1
> >>
>
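
For readers following the thread, the two merge conditions described in the
Motivation section can be modelled in a few lines of userspace C. This is only
a rough sketch and not the kernel code: struct toy_vma and every toy_*
function are hypothetical names made up for illustration, PAGE_SHIFT is
assumed to be 12, and the anon_vma rule is reduced to the "one and the same"
wording of the cover letter (the real check has more cases). The adjust_pgoff
flag models the idea of patch 2, not its implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define PAGE_SHIFT 12 /* assumption: 4 KiB pages */

/* Hypothetical, heavily simplified stand-in for vm_area_struct. */
struct toy_vma {
	unsigned long vm_start; /* first byte of the mapping */
	unsigned long vm_end;   /* one past the last byte */
	unsigned long vm_pgoff; /* page offset, counted in pages */
	const void *anon_vma;   /* opaque anon_vma identity, may be NULL */
};

static unsigned long toy_vma_pages(const struct toy_vma *v)
{
	return (v->vm_end - v->vm_start) >> PAGE_SHIFT;
}

/* An anonymous VMA takes its page offset from the address it is created at. */
static struct toy_vma toy_vma_create(unsigned long start, unsigned long len,
				     const void *anon_vma)
{
	struct toy_vma v;

	/* The model only handles page-aligned, non-empty ranges. */
	assert(len != 0);
	assert((start & ((1UL << PAGE_SHIFT) - 1)) == 0);
	assert((len & ((1UL << PAGE_SHIFT) - 1)) == 0);

	v.vm_start = start;
	v.vm_end = start + len;
	v.vm_pgoff = start >> PAGE_SHIFT;
	v.anon_vma = anon_vma;
	return v;
}

/*
 * Move a VMA to a new address. With adjust_pgoff == false the page offset
 * keeps its creation-time value, which is the situation the cover letter
 * describes. With adjust_pgoff == true it is recomputed from the new address.
 */
static void toy_vma_move(struct toy_vma *v, unsigned long new_start,
			 bool adjust_pgoff)
{
	unsigned long len = v->vm_end - v->vm_start;

	v->vm_start = new_start;
	v->vm_end = new_start + len;
	if (adjust_pgoff)
		v->vm_pgoff = new_start >> PAGE_SHIFT;
}

/* Can next be merged directly after prev under the simplified rules? */
static bool toy_can_merge(const struct toy_vma *prev,
			  const struct toy_vma *next)
{
	if (prev->vm_end != next->vm_start)
		return false; /* not adjacent in the address space */
	if (prev->vm_pgoff + toy_vma_pages(prev) != next->vm_pgoff)
		return false; /* page offsets are not contiguous */
	if (prev->anon_vma != next->anon_vma)
		return false; /* simplified "one and the same" rule */
	return true;
}
```

In this model, a VMA created at 0x40000 and then moved to sit right after a
VMA ending at 0x13000 fails toy_can_merge() unless the move is done with
adjust_pgoff set, because its stale offset (0x40) does not follow the
neighbour's.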