Johannes Weiner wrote:
diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index 5cfda39b3268..e066ac7353a4 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -711,12 +711,15 @@ bool out_of_memory(struct zonelist *zonelist, gfp_t gfp_mask,
 		killed = 1;
 	}
 out:
+	if (test_thread_flag(TIF_MEMDIE))
+		return true;
 	/*
-	 * Give the killed threads a good chance of exiting before trying to
-	 * allocate memory again.
+	 * Wait for any outstanding OOM victims to die. In rare cases
+	 * victims can get stuck behind the allocating tasks, so the
+	 * wait needs to be bounded. It's crude alright, but cheaper
+	 * than keeping a global dependency tree between all tasks.
 	 */
-	if (killed)
-		schedule_timeout_killable(1);
+	wait_event_timeout(oom_victims_wait, !atomic_read(&oom_victims), HZ);
 	return true;
 }
If out_of_memory() returns true after the bounded wait, the caller
effectively waits forever, without a subsequent OOM victim ever being
chosen, when the first OOM victim fails to die. The system will lock up,
won't it?
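
To make the concern concrete, here is a toy userspace model of the retry behaviour, not kernel code. Every name in it (struct oom_model, model_bounded_wait(), model_out_of_memory(), model_allocate()) is made up for illustration. It assumes that no new victim is selected while an earlier one is still outstanding, and that the allocator simply retries whenever the OOM path reports true.

```c
#include <stdbool.h>

/* Toy userspace model; none of these names are kernel APIs. */
struct oom_model {
	int oom_victims;	/* victims killed but not yet exited */
	bool victim_can_exit;	/* false: victim is stuck behind the allocating task */
	int kills;		/* number of victims selected so far */
};

/* Models the bounded wait: the victim exits in time only if it can make progress. */
static void model_bounded_wait(struct oom_model *m)
{
	if (m->oom_victims > 0 && m->victim_can_exit)
		m->oom_victims--;
}

/*
 * Models the patched OOM path under the assumption stated above:
 * a victim is chosen only when none is outstanding, then the caller
 * waits (bounded) and true is returned unconditionally.
 */
static bool model_out_of_memory(struct oom_model *m)
{
	if (m->oom_victims == 0) {
		m->oom_victims++;
		m->kills++;
	}
	model_bounded_wait(m);
	return true;
}

/*
 * Models an allocator that retries as long as the OOM path says "true".
 * Returns the number of attempts needed, or -1 if the victim never went
 * away within max_retries attempts.
 */
static int model_allocate(struct oom_model *m, int max_retries)
{
	for (int i = 1; i <= max_retries; i++) {
		model_out_of_memory(m);
		if (m->oom_victims == 0)
			return i;	/* victim exited, memory was freed */
	}
	return -1;
}
```

With victim_can_exit set to true, model_allocate() succeeds on the first attempt. With it set to false, every attempt waits out the timeout, kills stays at 1, and the loop never makes progress, which is the lockup being asked about.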