Re: [patch] mm, madvise: fail with ENOMEM when splitting vma will hit max_map_count

From: David Rientjes
Date: Wed Jan 25 2017 - 17:15:11 EST

On Wed, 25 Jan 2017, Anshuman Khandual wrote:

> But in the due course there might be other changes in number of VMAs of
> the process because of unmap() or merge() which could reduce the total
> number of VMAs and hence this condition may not exist afterwards. In
> that case EAGAIN still makes sense.

Imagine a single-threaded process that is operating on its own privately
mapped memory. Attempting to split an existing vma and meeting
vm.max_map_count is not something that will be fixed by trying again, i.e.
it is not helpful to loop when madvise() returns -1 with errno EAGAIN if
vm.max_map_count will always be encountered. The other cases where ENOMEM
is blindly converted to EAGAIN are slab allocation failures, which can be
resolved by external freeing; that is the meaning of "kernel resource is
temporarily unavailable." There is no such guarantee for vm.max_map_count,
so ENOMEM clearly indicates the failure.
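
To make "meeting vm.max_map_count" concrete, here is a rough userspace
sketch of how a process could compare its own mapping count against the
limit. The helper names are hypothetical, and counting lines in
/proc/self/maps is only an approximation of the kernel's per-mm map count
(an assumption of this sketch, not something the patch relies on).

```c
#include <stdio.h>

/* Hypothetical helper: reads the vm.max_map_count sysctl.
 * Returns -1 if the file cannot be opened or parsed. */
static long read_max_map_count(void)
{
    long limit = -1;
    FILE *f = fopen("/proc/sys/vm/max_map_count", "r");

    if (!f)
        return -1;
    if (fscanf(f, "%ld", &limit) != 1)
        limit = -1;
    fclose(f);
    return limit;
}

/* Hypothetical helper: approximates the number of vmas of the calling
 * process by counting lines in /proc/self/maps (one line per mapping).
 * Returns -1 if the file cannot be opened. */
static long count_mappings(void)
{
    long lines = 0;
    int c;
    FILE *f = fopen("/proc/self/maps", "r");

    if (!f)
        return -1;
    while ((c = fgetc(f)) != EOF)
        if (c == '\n')
            lines++;
    fclose(f);
    return lines;
}
```

If count_mappings() is already at or near read_max_map_count(), an
madvise() call that needs to split a vma cannot succeed until the process
itself unmaps or merges something, so retrying without doing that does
not help.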

With this patch, it makes sense for userspace to loop on EAGAIN for advice
such as MADV_DONTNEED because we are actively freeing memory when EAGAIN is
returned. Without it, a process that is meeting vm.max_map_count will loop
infinitely. This is the case in tcmalloc, and this patch addresses the
issue when vm.max_map_count is low.