[PATCH] oom: do not live lock on frozen tasks

From: Michal Hocko
Date: Fri Aug 26 2011 - 04:39:35 EST


[WARNING untested]

The OOM killer can end up in a live lock if select_bad_process() picks a
frozen task, because a frozen task does not act on SIGKILL until it is
thawed. On the other hand, we cannot mark such tasks as unkillable,
because we could then panic the system even though there is a chance that
somebody (e.g. a process from another cpuset or with a different
nodemask) could thaw the task so that we can make forward progress.

Let's give all frozen tasks a bonus (OOM_SCORE_ADJ_MAX/2) so that we do
not consider them unless really necessary. If we do end up picking one,
thaw its threads before we try to kill it.

TODO
- given bonus might be too big?
- aren't we racing with try_to_freeze_tasks?
---
mm/oom_kill.c | 13 +++++++++++++
1 files changed, 13 insertions(+), 0 deletions(-)
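
Note on the first TODO item (a sketch, not part of the patch):
OOM_SCORE_ADJ_MAX is 1000, so the bonus subtracts 500 points. The
user-space model below assumes that oom_badness() scores a task on a
0..1000 scale proportional to its share of memory and that non-positive
results are clamped to 1, as the "Never return 0" comment in the second
hunk suggests. model_badness() and its parameters are hypothetical names
used only for illustration.

```c
/* User-space model of the scoring arithmetic; this is not kernel code. */
#define OOM_SCORE_ADJ_MAX 1000

/*
 * Hypothetical helper: approximates the tail of oom_badness() with the
 * frozen-task bonus applied. Assumes a 0..1000 scale and a clamp to 1.
 */
static unsigned int model_badness(unsigned long task_pages,
				  unsigned long total_pages,
				  int oom_score_adj, int is_frozen)
{
	/* Share of memory in thousandths (assumed normalization). */
	long points = (long)(task_pages * 1000 / total_pages);

	points += oom_score_adj;

	/* The bonus proposed by this patch. */
	if (is_frozen)
		points -= OOM_SCORE_ADJ_MAX / 2;

	/* Never return 0 for an eligible task. */
	if (points <= 0)
		return 1;
	return points < 1000 ? (unsigned int)points : 1000;
}
```

Under these assumptions every frozen task that uses up to half of memory
ends up with a score of 1, so such tasks would not be ranked against each
other. That may be relevant when deciding whether the bonus is too big.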

diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index 626303b..fd194bc 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -32,6 +32,7 @@
 #include <linux/mempolicy.h>
 #include <linux/security.h>
 #include <linux/ptrace.h>
+#include <linux/freezer.h>
 
 int sysctl_panic_on_oom;
 int sysctl_oom_kill_allocating_task;
@@ -214,6 +215,14 @@ unsigned int oom_badness(struct task_struct *p, struct mem_cgroup *mem,
 	points += p->signal->oom_score_adj;
 
 	/*
+	 * Do not try to kill frozen tasks unless there is nothing else to kill.
+	 * We do not want to give it 1 point because we still want to select a good
+	 * candidate among all frozen tasks. Let's give it a reasonable bonus.
+	 */
+	if (frozen(p))
+		points -= OOM_SCORE_ADJ_MAX/2;
+
+	/*
 	 * Never return 0 for an eligible task that may be killed since it's
 	 * possible that no single user task uses more than 0.1% of memory and
 	 * no single admin tasks uses more than 3.0%.
@@ -450,6 +459,10 @@ static int oom_kill_task(struct task_struct *p, struct mem_cgroup *mem)
 			pr_err("Kill process %d (%s) sharing same memory\n",
 				task_pid_nr(q), q->comm);
 			task_unlock(q);
+
+			if (frozen(q))
+				thaw_process(q);
+
 			force_sig(SIGKILL, q);
 		}

--
1.7.5.4

--
Michal Hocko
SUSE Labs
SUSE LINUX s.r.o.
Lihovarska 1060/12
190 00 Praha 9
Czech Republic