Re: [PATCH] mm: centralize and fix max map count limit checking

From: Kalesh Singh
Date: Thu Sep 04 2025 - 12:25:39 EST


On Thu, Sep 4, 2025 at 3:14 AM David Hildenbrand <david@xxxxxxxxxx> wrote:
>
> On 04.09.25 01:24, Kalesh Singh wrote:
> > The check against the max map count (sysctl_max_map_count) was
> > open-coded in several places. This led to inconsistent enforcement
> > and subtle bugs where the limit could be exceeded.
> >
> > For example, some paths would check map_count > sysctl_max_map_count
> > before allocating a new VMA and incrementing the count, allowing the
> > process to reach sysctl_max_map_count + 1:
> >
> > int do_brk_flags(...)
> > {
> >         if (mm->map_count > sysctl_max_map_count)
> >                 return -ENOMEM;
> >
> >         /* We can get here with mm->map_count == sysctl_max_map_count */
> >
> >         vma = vm_area_alloc(mm);
> >         ...
> >         mm->map_count++; /* We've now exceeded the threshold. */
> > }
> >
> > To fix this and unify the logic, introduce a new function,
> > exceeds_max_map_count(), to consolidate the check. All open-coded
> > checks are replaced with calls to this new function, ensuring the
> > limit is applied uniformly and correctly.
> >
> > To improve encapsulation, sysctl_max_map_count is now static to
> > mm/mmap.c. The new helper also adds a rate-limited warning to make
> > debugging applications that exhaust their VMA limit easier.
> >
> > Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
> > Cc: Minchan Kim <minchan@xxxxxxxxxx>
> > Cc: Lorenzo Stoakes <lorenzo.stoakes@xxxxxxxxxx>
> > Signed-off-by: Kalesh Singh <kaleshsingh@xxxxxxxxxx>
> > ---
> > include/linux/mm.h | 11 ++++++++++-
> > mm/mmap.c | 15 ++++++++++++++-
> > mm/mremap.c | 7 ++++---
> > mm/nommu.c | 2 +-
> > mm/util.c | 1 -
> > mm/vma.c | 6 +++---
> > 6 files changed, 32 insertions(+), 10 deletions(-)
> >
> > diff --git a/include/linux/mm.h b/include/linux/mm.h
> > index 1ae97a0b8ec7..d4e64e6a9814 100644
> > --- a/include/linux/mm.h
> > +++ b/include/linux/mm.h
> > @@ -192,7 +192,16 @@ static inline void __mm_zero_struct_page(struct page *page)
> > #define MAPCOUNT_ELF_CORE_MARGIN (5)
> > #define DEFAULT_MAX_MAP_COUNT (USHRT_MAX - MAPCOUNT_ELF_CORE_MARGIN)
> >
> > -extern int sysctl_max_map_count;
> > +/**
> > + * exceeds_max_map_count - check if a VMA operation would exceed max_map_count
> > + * @mm: The memory descriptor for the process.
> > + * @new_vmas: The number of new VMAs the operation will create.
>
> It's not always a "will", right? At least I remember that this was the
> worst-case scenario in some cases ("may split").
>
> "The number of new VMAs the operation may create in the worst case."
>

Hi David,

You are correct. Cases like mremap account for the worst case (a 3-way
split on both src and dest). I'll update the description.
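
For illustration, here is a minimal user-space sketch of the semantics
under discussion. It is not the kernel code: struct mm_model,
max_map_count_model, old_style_check() and MREMAP_WORST_CASE_NEW_VMAS
are hypothetical stand-ins, and the value 4 assumes that a 3-way split
adds two VMAs on each of src and dest. The rate-limited warning is left
out.

```c
#include <stdbool.h>
#include <limits.h>

/* Hypothetical stand-in for the relevant part of struct mm_struct. */
struct mm_model {
	int map_count;	/* number of VMAs currently in the address space */
};

/* Mirrors DEFAULT_MAX_MAP_COUNT: USHRT_MAX - MAPCOUNT_ELF_CORE_MARGIN. */
static int max_map_count_model = USHRT_MAX - 5;

/*
 * Assumption for illustration: splitting one VMA three ways adds two
 * VMAs, and this may happen on both src and dest.
 */
#define MREMAP_WORST_CASE_NEW_VMAS 4

/* The open-coded pattern: passes when map_count is already at the limit. */
static bool old_style_check(const struct mm_model *mm)
{
	return mm->map_count > max_map_count_model;
}

/*
 * Sketch of the centralized check: true if creating up to new_vmas
 * additional VMAs (worst case) would push map_count past the limit.
 */
static bool exceeds_max_map_count_model(const struct mm_model *mm,
					int new_vmas)
{
	return mm->map_count + new_vmas > max_map_count_model;
}
```

With map_count equal to the limit, old_style_check() returns false and
the caller goes on to add a VMA, whereas the helper called with
new_vmas = 1 returns true. A caller that may or may not split would pass
its worst-case count, which is why "may create in the worst case" is the
more accurate wording for @new_vmas.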

Thanks,
Kalesh
>
> --
> Cheers
>
> David / dhildenb
>