[PATCH] oom: always panic on OOM when panic_on_oom is configured

From: Michal Hocko
Date: Mon Jun 01 2015 - 07:59:47 EST


panic_on_oom allows the administrator to set the OOM policy to panic the
system when it is out of memory. This reduces failover time, e.g. when
resolving the OOM condition would take much longer than rebooting the
system.

out_of_memory tries to be clever and avoid premature panics by checking
the current task first: it does not panic when the task has a fatal
signal pending, because the task should die shortly and release some
memory. This is fair enough, but Tetsuo Handa has noted that this might
lead to a silent deadlock when current cannot exit because of
dependencies invisible to the OOM killer.

panic_on_oom is disabled by default, and if somebody enables it then any
risk of a potential deadlock is certainly unwelcome. The risk is really
low because there are usually more sources of allocation requests and
one of them would eventually trigger the panic, but it is better to
reduce the risk as much as possible.

Let's move check_panic_on_oom up before the current task is
checked so that the knob value is honored in that case as well. Do the
same for the memcg in mem_cgroup_out_of_memory.
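
The effect of the reordering can be sketched with a small userspace
model. This is only an illustration, not kernel code: enum oom_outcome,
should_panic, oom_before_patch and oom_after_patch are hypothetical
names, and the "current is dying" check is reduced to a single boolean.
should_panic is assumed to follow the policy of check_panic_on_oom
(0 never panics, 1 panics only for unconstrained allocations, 2 always
panics).

```c
#include <assert.h>
#include <stdbool.h>

/* Same names as the kernel's enum oom_constraint; values are illustrative. */
enum oom_constraint {
	CONSTRAINT_NONE,
	CONSTRAINT_CPUSET,
	CONSTRAINT_MEMORY_POLICY,
	CONSTRAINT_MEMCG,
};

/* Hypothetical result type: what one simulated OOM invocation decides. */
enum oom_outcome {
	OOM_PANIC,		/* the system would panic */
	OOM_LET_CURRENT_EXIT,	/* current is dying, let it allocate and exit */
	OOM_KILL_VICTIM,	/* go on and select a victim */
};

/*
 * Models the policy of check_panic_on_oom(): 0 never panics, 1 panics
 * only when the allocation is unconstrained, 2 panics unconditionally.
 */
static bool should_panic(int panic_on_oom, enum oom_constraint constraint)
{
	assert(panic_on_oom >= 0 && panic_on_oom <= 2);
	if (panic_on_oom == 0)
		return false;
	if (panic_on_oom != 2 && constraint != CONSTRAINT_NONE)
		return false;
	return true;
}

/* Old order: the dying-task shortcut runs first and can skip the panic. */
static enum oom_outcome oom_before_patch(int panic_on_oom,
					 enum oom_constraint constraint,
					 bool current_dying)
{
	if (current_dying)
		return OOM_LET_CURRENT_EXIT;
	if (should_panic(panic_on_oom, constraint))
		return OOM_PANIC;
	return OOM_KILL_VICTIM;
}

/* New order: the panic policy is evaluated before looking at current. */
static enum oom_outcome oom_after_patch(int panic_on_oom,
					enum oom_constraint constraint,
					bool current_dying)
{
	if (should_panic(panic_on_oom, constraint))
		return OOM_PANIC;
	if (current_dying)
		return OOM_LET_CURRENT_EXIT;
	return OOM_KILL_VICTIM;
}
```

In this model the two orderings differ only when current is dying and
the knob asks for a panic: oom_before_patch(2, CONSTRAINT_NONE, true)
lets current exit, while oom_after_patch with the same arguments
panics. With panic_on_oom set to 0 both behave the same.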

Reported-by: Tetsuo Handa <penguin-kernel@xxxxxxxxxxxxxxxxxxx>
Signed-off-by: Michal Hocko <mhocko@xxxxxxx>
---
mm/memcontrol.c | 3 ++-
mm/oom_kill.c | 18 +++++++++---------
2 files changed, 11 insertions(+), 10 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 86648a718d21..d3c906da6a09 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -1532,6 +1532,8 @@ static void mem_cgroup_out_of_memory(struct mem_cgroup *memcg, gfp_t gfp_mask,

mutex_lock(&oom_lock);

+ check_panic_on_oom(CONSTRAINT_MEMCG, gfp_mask, order, NULL, memcg);
+
/*
* If current has a pending SIGKILL or is exiting, then automatically
* select it. The goal is to allow it to allocate so that it may
@@ -1542,7 +1544,6 @@ static void mem_cgroup_out_of_memory(struct mem_cgroup *memcg, gfp_t gfp_mask,
goto unlock;
}

- check_panic_on_oom(CONSTRAINT_MEMCG, gfp_mask, order, NULL, memcg);
totalpages = mem_cgroup_get_limit(memcg) ? : 1;
for_each_mem_cgroup_tree(iter, memcg) {
struct css_task_iter it;
diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index dff991e0681e..f8c83b791dd5 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -667,6 +667,15 @@ bool out_of_memory(struct zonelist *zonelist, gfp_t gfp_mask,
goto out;

/*
+ * Check if there were limitations on the allocation (only relevant for
+ * NUMA) that may require different handling.
+ */
+ constraint = constrained_alloc(zonelist, gfp_mask, nodemask,
+ &totalpages);
+ mpol_mask = (constraint == CONSTRAINT_MEMORY_POLICY) ? nodemask : NULL;
+ check_panic_on_oom(constraint, gfp_mask, order, mpol_mask, NULL);
+
+ /*
* If current has a pending SIGKILL or is exiting, then automatically
* select it. The goal is to allow it to allocate so that it may
* quickly exit and free its memory.
@@ -680,15 +689,6 @@ bool out_of_memory(struct zonelist *zonelist, gfp_t gfp_mask,
goto out;
}

- /*
- * Check if there were limitations on the allocation (only relevant for
- * NUMA) that may require different handling.
- */
- constraint = constrained_alloc(zonelist, gfp_mask, nodemask,
- &totalpages);
- mpol_mask = (constraint == CONSTRAINT_MEMORY_POLICY) ? nodemask : NULL;
- check_panic_on_oom(constraint, gfp_mask, order, mpol_mask, NULL);
-
if (sysctl_oom_kill_allocating_task && current->mm &&
!oom_unkillable_task(current, NULL, nodemask) &&
current->signal->oom_score_adj != OOM_SCORE_ADJ_MIN) {
--
2.1.4