Re: [patch -mm v2] mm: introduce oom_adj_child

From: KAMEZAWA Hiroyuki
Date: Thu Jul 30 2009 - 20:35:02 EST


On Thu, 30 Jul 2009 12:05:30 -0700 (PDT)
David Rientjes <rientjes@xxxxxxxxxx> wrote:

> On Thu, 30 Jul 2009, KAMEZAWA Hiroyuki wrote:
>
> > > If you have suggestions for a better name, I'd happily ack it.
> > >
> >
> > Simply, reset_oom_adj_at_new_mm_context or some.
> >
>
> I think it's preferred to keep the name relatively short which is an
> unfortuante requirement in this case. I also prefer to start the name
> with "oom_adj" so it appears alongside /proc/pid/oom_adj when listed
> alphabetically.
>
But misleading name is bad.



> > > > 2. More simple plan is like this, IIUC.
> > > >
> > > > fix oom-killer's select_bad_process() not to be in deadlock.
> > > >
> > >
> > > Alternate ideas?
> > >
> > At brief thiking.
> >
> > 1. move oom_adj from mm_struct to signal struct. or somewhere.
> > (see copy_signal())
> > Then,
> > - all threads in a process will have the same oom_adj.
> > - vfork()'ed thread will inherit its parent's oom_adj.
> > - vfork()'ed thread can override oom_adj of its own.
> >
> > In other words, oom_adj is shared when CLONE_PARENT is not set.
> >
>
> Hmm, didn't we talk about signal_struct already? The problem with that
> approach is that oom_adj values represent a killable quantity of memory,
> so having multiple threads sharing the same mm_struct with one set to
> OOM_DISABLE and the other at +15 will still livelock because the oom
> killer can't kill either.
>
> > 2. rename mm_struct's oom_adj as shadow_oom_adj.
> >
> > update this shadow_oom_adj as the highest oom_adj among
> > the values all threads share this mm_struct have.
> > This update is done when
> > - mm_init()
> > - oom_adj is written.
> >
> > User's
> > # echo XXXX > /proc/<x>/oom_adj
> > is not necessary to be very very fast.
> >
> > I don't think a process which calls vfork() is multi-threaded.
> >
> > 3. use shadow_oom_adj in select_bad_process().
> >
>
> Ideas 2 & 3 here seem to be a single proposal. The problem is that it
> still leaves /proc/pid/oom_score to be inconsistent with the badness
> scoring that the oom killer will eventually use since if it oom kills one
> task, it must kill all tasks sharing the same mm_struct to lead to future
> memory freeing.
>
yes.

> Additionally, if you were to set one thread to OOM_DISABLE, storing the
> highest oom_adj value in mm_struct isn't going to help because
> oom_kill_task() will still require a tasklist scan to ensure no threads
> sharing the mm_struct are OOM_DISABLE and the livelock persists.
>

Why don't you think select_bad_process()-> oom_kill_task() implementation is bad ?
IMHO, it's bad manner to fix an os-implementation problem by adding _new_ user
interface which is hard to understand.


> In other words, the issue here is larger than the inheritance of the
> oom_adj value amongst children, it addresses a livelock that neither of
> your approaches solve. The fix actually makes /proc/pid/oom_adj (and
> /proc/pid/oom_score) consistent with how the oom killer behaves.

This oom_adj_child itself is not related to livelock problem. Don't make
the problem bigger than it is.
oom_adj_child itself is just a problem how to handle vfork().

Thanks,
-Kame

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/