Re: [PATCH] memcg: oom: ignore oom warnings from memory.max

From: Michal Hocko
Date: Mon May 04 2020 - 02:56:07 EST


On Thu 30-04-20 11:27:12, Shakeel Butt wrote:
> Lowering memory.max can trigger an oom-kill if the reclaim does not
> succeed. However if oom-killer does not find a process for killing, it
> dumps a lot of warnings.

It shouldn't dump much more than the regular OOM report AFAICS. Sure
there is "Out of memory and no killable processes..." message printed as
well but is that a real problem?

> Deleting a memcg does not reclaim memory from it and the memory can
> linger till there is a memory pressure. One normal way to proactively
> reclaim such memory is to set memory.max to 0 just before deleting the
> memcg. However if some of the memcg's memory is pinned by others, this
> operation can trigger an oom-kill without any process and thus can log a
> lot un-needed warnings. So, ignore all such warnings from memory.max.

OK, I can see why you might want to use memory.max for that purpose but
I do not really understand why the oom report is a problem here.
memory.max can trigger the oom kill and user should be expecting the oom
report under that condition. Why is "no eligible task" so special? Is it
because you know that there won't be any tasks for your particular case?
What about other use cases where memory.max is not used as a "sweep
before tear down"?

> Signed-off-by: Shakeel Butt <shakeelb@xxxxxxxxxx>
> ---
> include/linux/oom.h | 3 +++
> mm/memcontrol.c | 9 +++++----
> mm/oom_kill.c | 2 +-
> 3 files changed, 9 insertions(+), 5 deletions(-)
>
> diff --git a/include/linux/oom.h b/include/linux/oom.h
> index c696c265f019..6345dc55df64 100644
> --- a/include/linux/oom.h
> +++ b/include/linux/oom.h
> @@ -52,6 +52,9 @@ struct oom_control {
>
> /* Used to print the constraint info. */
> enum oom_constraint constraint;
> +
> + /* Do not warn even if there is no process to be killed. */
> + bool no_warn;
> };
>
> extern struct mutex oom_lock;
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 317dbbaac603..a1f00d9b9bb0 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -1571,7 +1571,7 @@ unsigned long mem_cgroup_size(struct mem_cgroup *memcg)
> }
>
> static bool mem_cgroup_out_of_memory(struct mem_cgroup *memcg, gfp_t gfp_mask,
> - int order)
> + int order, bool no_warn)
> {
> struct oom_control oc = {
> .zonelist = NULL,
> @@ -1579,6 +1579,7 @@ static bool mem_cgroup_out_of_memory(struct mem_cgroup *memcg, gfp_t gfp_mask,
> .memcg = memcg,
> .gfp_mask = gfp_mask,
> .order = order,
> + .no_warn = no_warn,
> };
> bool ret;
>
> @@ -1821,7 +1822,7 @@ static enum oom_status mem_cgroup_oom(struct mem_cgroup *memcg, gfp_t mask, int
> mem_cgroup_oom_notify(memcg);
>
> mem_cgroup_unmark_under_oom(memcg);
> - if (mem_cgroup_out_of_memory(memcg, mask, order))
> + if (mem_cgroup_out_of_memory(memcg, mask, order, false))
> ret = OOM_SUCCESS;
> else
> ret = OOM_FAILED;
> @@ -1880,7 +1881,7 @@ bool mem_cgroup_oom_synchronize(bool handle)
> mem_cgroup_unmark_under_oom(memcg);
> finish_wait(&memcg_oom_waitq, &owait.wait);
> mem_cgroup_out_of_memory(memcg, current->memcg_oom_gfp_mask,
> - current->memcg_oom_order);
> + current->memcg_oom_order, false);
> } else {
> schedule();
> mem_cgroup_unmark_under_oom(memcg);
> @@ -6106,7 +6107,7 @@ static ssize_t memory_max_write(struct kernfs_open_file *of,
> }
>
> memcg_memory_event(memcg, MEMCG_OOM);
> - if (!mem_cgroup_out_of_memory(memcg, GFP_KERNEL, 0))
> + if (!mem_cgroup_out_of_memory(memcg, GFP_KERNEL, 0, true))
> break;
> }
>
> diff --git a/mm/oom_kill.c b/mm/oom_kill.c
> index 463b3d74a64a..5ace39f6fe1e 100644
> --- a/mm/oom_kill.c
> +++ b/mm/oom_kill.c
> @@ -1098,7 +1098,7 @@ bool out_of_memory(struct oom_control *oc)
>
> select_bad_process(oc);
> /* Found nothing?!?! */
> - if (!oc->chosen) {
> + if (!oc->chosen && !oc->no_warn) {
> dump_header(oc, NULL);
> pr_warn("Out of memory and no killable processes...\n");
> /*
> --
> 2.26.2.526.g744177e7f7-goog

--
Michal Hocko
SUSE Labs