Re: [patch 1/2] mm, memcg: avoid oom notification when current needsaccess to memory reserves

From: David Rientjes
Date: Tue Dec 03 2013 - 18:50:51 EST


On Tue, 3 Dec 2013, Michal Hocko wrote:

> OK, as it seems that the notification part is too controversial, how
> would you like the following? It reverts the notification part and still
> solves the fault on exit path. I will prepare the full patch with the
> changelog if this looks reasonable:

Um, no, that's not satisfactory because it obviously does the check after
mem_cgroup_oom_notify(). There is absolutely no reason why userspace
should be woken up when current simply needs access to memory reserves to
exit. You can already get such notification by memory thresholds at the
memcg limit.

I'll repeat: Section 10 of Documentation/cgroups/memory.txt specifies what
userspace should do when waking up; one of those options is not "check if
the memcg is still actually oom in a short period of time once a charging
task with a pending SIGKILL or in the exit path has been able to exit."
Users of this interface typically also disable the memcg oom killer
through the same file, it's ludicrous to put the responsibility on
userspace to determine if the wakeup is actionable and requires it to
intervene in one of the methods listed in section 10.

> ---
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 28c9221b74ea..f44fe7e65a98 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -1783,6 +1783,16 @@ static void mem_cgroup_out_of_memory(struct mem_cgroup *memcg, gfp_t gfp_mask,
> unsigned int points = 0;
> struct task_struct *chosen = NULL;
>
> + /*
> + * If current has a pending SIGKILL or is exiting, then automatically
> + * select it. The goal is to allow it to allocate so that it may
> + * quickly exit and free its memory.
> + */
> + if (fatal_signal_pending(current) || current->flags & PF_EXITING) {
> + set_thread_flag(TIF_MEMDIE);
> + goto cleanup;
> + }
> +
> check_panic_on_oom(CONSTRAINT_MEMCG, gfp_mask, order, NULL);
> totalpages = mem_cgroup_get_limit(memcg) >> PAGE_SHIFT ? : 1;
> for_each_mem_cgroup_tree(iter, memcg) {
> @@ -2233,16 +2243,6 @@ bool mem_cgroup_oom_synchronize(bool handle)
> if (!handle)
> goto cleanup;
>
> - /*
> - * If current has a pending SIGKILL or is exiting, then automatically
> - * select it. The goal is to allow it to allocate so that it may
> - * quickly exit and free its memory.
> - */
> - if (fatal_signal_pending(current) || current->flags & PF_EXITING) {
> - set_thread_flag(TIF_MEMDIE);
> - goto cleanup;
> - }
> -
> owait.memcg = memcg;
> owait.wait.flags = 0;
> owait.wait.func = memcg_oom_wake_function;
> @@ -2266,6 +2266,13 @@ bool mem_cgroup_oom_synchronize(bool handle)
> schedule();
> mem_cgroup_unmark_under_oom(memcg);
> finish_wait(&memcg_oom_waitq, &owait.wait);
> +
> + /* Userspace OOM handler cannot set TIF_MEMDIE to a target */
> + if (memcg->oom_kill_disable) {
> + if ((fatal_signal_pending(current) ||
> + current->flags & PF_EXITING))
> + set_thread_flag(TIF_MEMDIE);
> + }
> }
>
> if (locked) {
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/