Re: [PATCH] mm/memcontrol: update documentation about invoking oom killer

From: Konstantin Khlebnikov
Date: Sun Nov 03 2019 - 05:47:16 EST


On 03/11/2019 02.55, David Rientjes wrote:
> On Sat, 2 Nov 2019, Konstantin Khlebnikov wrote:
>
> > Since commit 29ef680ae7c2 ("memcg, oom: move out_of_memory back to the
> > charge path") memcg invokes oom killer not only for user page-faults.
> > This means 0-order allocation will either succeed or task get killed.
> >
> > Fixes: 8e675f7af507 ("mm/oom_kill: count global and memory cgroup oom kills")
> > Signed-off-by: Konstantin Khlebnikov <khlebnikov@xxxxxxxxxxxxxx>
> > ---
> > Documentation/admin-guide/cgroup-v2.rst | 9 +++++++--
> > 1 file changed, 7 insertions(+), 2 deletions(-)
> >
> > diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst
> > index 5361ebec3361..eb47815e137b 100644
> > --- a/Documentation/admin-guide/cgroup-v2.rst
> > +++ b/Documentation/admin-guide/cgroup-v2.rst
> > @@ -1219,8 +1219,13 @@ PAGE_SIZE multiple when read back.
> > Failed allocation in its turn could be returned into
> > userspace as -ENOMEM or silently ignored in cases like
> > - disk readahead. For now OOM in memory cgroup kills
> > - tasks iff shortage has happened inside page fault.
> > + disk readahead.
> > +
> > + Before 4.19 OOM in memory cgroup killed tasks iff
> > + shortage has happened inside page fault, random
> > + syscall may fail with ENOMEM or EFAULT. Since 4.19
> > + failed memory cgroup allocation invokes oom killer and
> > + keeps retrying until it succeeds.
> > This event is not raised if the OOM killer is not
> > considered as an option, e.g. for failed high-order
>
> The previous text is obviously incorrect for today's kernels, but I'm
> curious if we should be conflating the documentation here by describing
> the pre-4.19 behavior. OOM killing no longer happens only on page fault
> so maybe better to document the exact behavior today and not attempt to
> describe differences with previous versions?


The previous behaviour was here for ages and 4.19 is not so old.
According to https://www.kernel.org/category/releases.html pre-4.19 kernels
will be maintained for a couple of years at least. Let's keep this tombstone.

I've seen a lot of strange side effects of the old behaviour.
The most obscure was a hang inside libc fork() when clone(CLONE_CHILD_SETTID)
silently failed to set the child pid =)
https://lore.kernel.org/lkml/20150206162301.18031.32251.stgit@buzz/
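
For anyone who wants to watch this from userspace: the paragraph being patched documents the "oom" entry of memory.events, and the commit named in Fixes: added the "oom_kill" counter next to it. Below is a minimal Python sketch that parses that file, assuming the usual flat "key value" layout. The helper names, the cgroup directory argument and the sample numbers are made up for illustration and are not taken from the patch.

```python
def parse_memory_events(text):
    """Parse the flat "key value" lines of a cgroup v2 memory.events file."""
    events = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or unexpected lines
        key, value = parts
        events[key] = int(value)
    return events


def read_memory_events(cgroup_dir):
    """Read memory.events from a cgroup directory (path supplied by the caller)."""
    with open(f"{cgroup_dir}/memory.events") as f:
        return parse_memory_events(f.read())


# Illustrative sample only; real counters come from the kernel.
sample = "low 0\nhigh 12\nmax 3\noom 2\noom_kill 1\n"
events = parse_memory_events(sample)
print("oom:", events["oom"], "oom_kill:", events["oom_kill"])
```

On a real system one would call read_memory_events() with the cgroup's directory under the cgroup2 mount point and compare the "oom" and "oom_kill" values over time.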