Re: [PATCH] mm/util.c: Add error logs for commitment overflow

From: Michal Hocko
Date: Fri Oct 02 2020 - 08:17:30 EST


On Fri 02-10-20 17:27:41, Pintu Kumar wrote:
> Headless embedded devices often come with a very limited amount
> of RAM, such as 256MB or even less.
> These types of system often rely on command line interface which can
> execute system commands in the background using the fork/exec combination.
> There could be even many child tasks invoked internally to handle multiple
> requests.
> In this scenario, if the parent task keeps committing a large amount of
> memory, there are chances that this commitment can easily overflow the
> total RAM available in the system. Now if the parent process invokes fork
> or system commands (using the system() call) and the commitment ratio is at
> 50%, the request fails with the following, even though there is a large
> amount of free memory available in the system:
> fork failed: Cannot allocate memory
>
> If there are too many 3rd party tasks calling fork, it becomes difficult to
> identify exactly which parent process is overcommitting memory.
> Since free memory is also available, this "Cannot allocate memory" from
> fork creates confusion for application developers.
>
> Thus, I found that this simple print message (even once) is helping in
> quickly identifying the culprit.
>
> This is the output we can see on a 256MB system and with a simple malloc
> and fork program.
>
> [root@ ~]# cat /proc/meminfo
> MemTotal: 249520 kB ==> 243MB
> MemFree: 179100 kB
>
> PPID PID USER RSS VSZ STAT ARGS
> 150 164 root 1440 250580 S ./consume-and-fork.out 243
>
> __vm_enough_memory: commitment overflow: ppid:150, pid:164, pages:62451
> fork failed[count:0]: Cannot allocate memory

While I understand that fork failing due to the overcommit heuristic is
non-intuitive, and I have seen people scratching their heads over this in
the past, I am not convinced this is the right approach to tackle the
problem. First off, referencing pids is not really going to help much if
the process is short-lived. Secondly, __vm_enough_memory is about any
address space allocation. Why would you be interested in the parent when
doing mmap?
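
For context, the numbers in the quoted report can be checked by hand. The
following is only a minimal userspace sketch, not kernel code. It assumes
4 kB pages, no swap, and a heuristic mode that refuses a single request only
when it exceeds total RAM plus swap. The helper names pages_to_kb() and
request_refused() are hypothetical.

```c
#include <stdbool.h>

/* Assumption: 4 kB pages; the report does not state the page size. */
#define PAGE_KB 4L

/* Convert a page count (as printed in the log line) to kB. */
static long pages_to_kb(long pages)
{
	return pages * PAGE_KB;
}

/*
 * Hypothetical model of the heuristic check: a single request is
 * refused only when it is larger than total RAM plus swap. Free
 * memory plays no role in this comparison.
 */
static bool request_refused(long pages, long memtotal_kb, long swap_kb)
{
	return pages_to_kb(pages) > memtotal_kb + swap_kb;
}
```

Under these assumptions, pages:62451 corresponds to 249804 kB, which is just
above the reported MemTotal of 249520 kB, so the request is refused no matter
how large MemFree is.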

Last but not least, _once is questionable as well. The first instance
might happen early during the system lifetime and you will not learn
about future failures, so the overall point of debuggability is seriously
inhibited.

Maybe what you want is to report higher up the call chain (fork?) and
have it ratelimited rather than _once? Or maybe just try to live with
the confusing situation?
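
To illustrate the difference between _once and ratelimited reporting, here
is a minimal userspace sketch. It is not the kernel's printk implementation.
The types once_gate and rate_gate and the functions once_allow() and
rate_allow() are hypothetical names, and time is passed in as plain seconds.

```c
#include <stdbool.h>

/* Print-once gate: lets only the very first event through. */
struct once_gate {
	bool fired;
};

static bool once_allow(struct once_gate *g)
{
	if (g->fired)
		return false;
	g->fired = true;
	return true;
}

/* Rate-limit gate: at most `burst` messages per `interval` seconds. */
struct rate_gate {
	long interval;	/* window length in seconds */
	int burst;	/* messages allowed per window */
	long begin;	/* start time of the current window */
	int printed;	/* messages let through in the current window */
};

static bool rate_allow(struct rate_gate *g, long now)
{
	/* Open a fresh window once the current one has expired. */
	if (now - g->begin >= g->interval) {
		g->begin = now;
		g->printed = 0;
	}
	if (g->printed >= g->burst)
		return false;
	g->printed++;
	return true;
}
```

With a failure at boot and another one an hour later, once_allow() lets only
the first through, while rate_allow() lets both through and still suppresses
a burst of failures within a single window.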

> Signed-off-by: Pintu Kumar <pintu@xxxxxxxxxxxxxx>
> ---
> mm/util.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/mm/util.c b/mm/util.c
> index 5ef378a..9431ce7a 100644
> --- a/mm/util.c
> +++ b/mm/util.c
> @@ -895,6 +895,9 @@ int __vm_enough_memory(struct mm_struct *mm, long pages, int cap_sys_admin)
>  error:
>  	vm_unacct_memory(pages);
>
> +	pr_err_once("%s: commitment overflow: ppid:%d, pid:%d, pages:%ld\n",
> +		__func__, current->parent->pid, current->pid, pages);
> +
>  	return -ENOMEM;
>  }
>
> --
> Qualcomm India Private Limited, on behalf of Qualcomm Innovation Center, Inc.,
> is a member of Code Aurora Forum, a Linux Foundation Collaborative Project.

--
Michal Hocko
SUSE Labs