Re: [PATCH for 2.6.31 0/4] fix oom_adj regression v2

From: Andrew Morton
Date: Wed Aug 05 2009 - 19:40:45 EST


On Tue, 4 Aug 2009 19:25:08 +0900 (JST) KOSAKI Motohiro <kosaki.motohiro@xxxxxxxxxxxxxx> wrote:

> Commit 2ff05b2b (oom: move oom_adj value) moved the oom_adj value into mm_struct.
> It is a very good first step toward sanitizing OOM handling.
>
> However, Paul Menage reported that the commit causes a regression in his job scheduler:
> the current OOM logic can kill a process that has set OOM_DISABLE.
>
> Why? His program has code similar to the following.
>
> ...
> set_oom_adj(OOM_DISABLE); /* The job scheduler must never be killed by the OOM killer */
> ...
> if (vfork() == 0) {
>         set_oom_adj(0); /* The invoked child can be killed */
>         execve("foo-bar-cmd");
> }
> ....
>
> The vfork() parent and child share the same mm_struct, so the set_oom_adj(0) above does not
> only change oom_adj for the vfork() child; it also changes oom_adj for the vfork() parent.
> As a result, the vfork() parent (the job scheduler) lost its OOM immunity and was killed.
>
> The fork-setting-exec idiom is used very frequently in userland programs. We must
> not break this assumption.
>
> This patch series is slightly big, but we must fix any regression soon.
>
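
For anyone following along, here is a minimal userland sketch of the
set-then-exec idiom described in the quoted text. It is not Paul's actual
code. The helper names write_oom_adj() and spawn_killable() are made up for
illustration, and the sketch deliberately uses fork() instead of the vfork()
in the quoted snippet, so the child gets its own mm_struct and the write
cannot leak back into the parent. The oom_adj file path is passed in as a
parameter; on a real system it would be /proc/self/oom_adj.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: write an oom_adj value to the file at `path`.
 * Returns 0 on success, -1 on failure.
 */
static int write_oom_adj(const char *path, int value)
{
	assert(path != NULL);

	FILE *f = fopen(path, "w");
	if (f == NULL)
		return -1;

	int ok = fprintf(f, "%d\n", value) > 0;
	if (fclose(f) != 0)
		ok = 0;
	return ok ? 0 : -1;
}

/*
 * Hypothetical helper: run argv[0] in a child whose oom_adj is set to
 * `child_adj` before exec. fork() gives the child a private copy of the
 * address space, unlike vfork(), which shares it with the parent.
 * Returns the child's wait status, or -1 on error.
 */
static int spawn_killable(const char *adj_path, int child_adj,
			  char *const argv[])
{
	pid_t pid = fork();
	if (pid < 0)
		return -1;

	if (pid == 0) {
		/* Child: adjust its own OOM setting, then exec. */
		if (write_oom_adj(adj_path, child_adj) != 0)
			_exit(126);
		execvp(argv[0], argv);
		_exit(127);	/* exec failed */
	}

	int status;
	if (waitpid(pid, &status, 0) < 0)
		return -1;
	return status;
}
```

A job scheduler would first call write_oom_adj("/proc/self/oom_adj", -17)
on itself (-17 is OOM_DISABLE, and lowering the value needs
CAP_SYS_RESOURCE), then call spawn_killable("/proc/self/oom_adj", 0, argv)
for each job. The cost relative to vfork() is the extra page-table copy at
fork time.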

So I merged these but I have a feeling that this isn't the last I'll be
hearing on the topic ;)

Given the amount of churn, the amount of discussion and the size of the
patches, this doesn't look like something we should push into 2.6.31.

If we think that the 2ff05b2b regression is sufficiently serious to be
a must-fix for 2.6.31 then can we please find something safer and
smaller? Like reverting 2ff05b2b?


These patches clash with the controversial
mm-introduce-proc-pid-oom_adj_child.patch, so I've disabled that patch
now.
