Re: mm->oom_disable_count is broken

From: David Rientjes
Date: Mon Aug 29 2011 - 19:18:13 EST


On Mon, 29 Aug 2011, Oleg Nesterov wrote:

> > IIRC, I did point out this issue, but nobody replied.
> > I think ->oom_disable_count is currently broken, but I have no time now to
> > audit this stuff. So I'd suggest reverting this code if nobody fixes it.
>
> I tend to agree. Of course we can fix oom_disable_count, but I don't
> really understand why we want it.
>

I'd rather just remove it entirely, but we'll have to ask its author. Ying,
do you see a reason to keep oom_disable_count around?

The only thing that I can see it doing is preventing a thread that shares
an ->mm with an unkillable thread from being killed itself, since killing it
won't lead to future memory freeing. It also avoids a second tasklist
iteration, after a task has been chosen, to check whether another thread
sharing the memory cannot be killed.

I'd rather just kill the thread anyway, for two reasons: there's a chance
that the OOM_DISABLE thread is waiting on it and may free its memory as
well, and there's no guarantee that when you set a thread to be OOM_DISABLE,
all threads sharing the same memory are disabled as well.
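
To make the difference concrete, here is a toy userspace model of the two
policies. It is only a sketch, not kernel code: every toy_* name is made up
for illustration, and the counter is assumed to track the number of
unkillable tasks attached to an mm.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_OOM_SCORE_ADJ_MIN (-1000)   /* assumed "never kill" value */

/* Hypothetical stand-ins for the mm and the task; not the kernel structs. */
struct toy_mm {
	int oom_disable_count;          /* unkillable tasks sharing this mm */
};

struct toy_task {
	struct toy_mm *mm;
	int oom_score_adj;
};

/* Attach a task to an mm and keep the counter in sync. */
static void toy_attach(struct toy_task *t, struct toy_mm *mm, int adj)
{
	t->mm = mm;
	t->oom_score_adj = adj;
	if (adj == TOY_OOM_SCORE_ADJ_MIN)
		mm->oom_disable_count++;
}

/* With the counter: a task is skipped if anyone sharing its mm is unkillable. */
static bool toy_eligible_with_count(const struct toy_task *t)
{
	return t->mm->oom_disable_count == 0;
}

/* Without the counter: only the task's own setting matters. */
static bool toy_eligible_without_count(const struct toy_task *t)
{
	return t->oom_score_adj != TOY_OOM_SCORE_ADJ_MIN;
}

/* Two tasks share one mm; only the first one is unkillable. */
static void toy_demo(void)
{
	struct toy_mm mm = { 0 };
	struct toy_task disabled, sibling;

	toy_attach(&disabled, &mm, TOY_OOM_SCORE_ADJ_MIN);
	toy_attach(&sibling, &mm, 0);

	assert(!toy_eligible_with_count(&sibling));   /* shielded by the counter */
	assert(toy_eligible_without_count(&sibling)); /* killable without it */
	assert(!toy_eligible_without_count(&disabled));
}
```

In this model, dropping the counter only changes the outcome for the
sibling: it becomes a candidate again, while the unkillable task itself is
still skipped.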

> And personally I dislike it because ->oom_disable_count is just another
> proof that ->oom_score_adj should be in ->mm, not per-process. IIRC,
> you already explained to me why we can't do this, but - sorry - I forgot.
> Maybe something with vfork... Could you explain this again?
>

I actually really wanted oom_score_adj to be in the ->mm; it would
simplify a lot of the code :) The problem was the inheritance property:
we expect a job scheduler that is OOM_DISABLE to be able to vfork, change
the oom_score_adj of the child, and then exec, so that the child is no
longer oom disabled when it starts to allocate memory. If the value were in
the mm, then setting the oom_score_adj of the child prior to exec would
change the job scheduler's oom score as well, because the vfork'd child
still shares the parent's mm at that point.
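
A toy model of that vfork case, purely as an illustration. None of this is
kernel code: the sk_* names are hypothetical, and the only assumption is
that a vfork'd child shares its parent's mm until it execs.

```c
#include <assert.h>

/* Hypothetical stand-ins: one value kept per mm, one kept per task. */
struct sk_mm {
	int oom_score_adj;
};

struct sk_task {
	struct sk_mm *mm;
	int oom_score_adj;
};

/* Model of vfork: the child gets its own task but points at the same mm. */
static struct sk_task sk_vfork(const struct sk_task *parent)
{
	struct sk_task child = *parent;   /* copies the mm pointer, not the mm */
	return child;
}

/* Per-task storage: the write touches only the task it is applied to. */
static void sk_set_adj_per_task(struct sk_task *t, int adj)
{
	t->oom_score_adj = adj;
}

/* Per-mm storage: the write is visible to every task sharing the mm. */
static void sk_set_adj_per_mm(struct sk_task *t, int adj)
{
	t->mm->oom_score_adj = adj;
}

static void sk_demo(void)
{
	struct sk_mm mm = { .oom_score_adj = -1000 };
	struct sk_task sched = { .mm = &mm, .oom_score_adj = -1000 };
	struct sk_task child = sk_vfork(&sched);

	/* Per-task: the child is re-enabled, the scheduler stays protected. */
	sk_set_adj_per_task(&child, 0);
	assert(child.oom_score_adj == 0);
	assert(sched.oom_score_adj == -1000);

	/* Per-mm: the same write before exec also unprotects the scheduler. */
	sk_set_adj_per_mm(&child, 0);
	assert(sched.mm->oom_score_adj == 0);
}
```

The second half of sk_demo() is the failure mode: with the value in the
mm, the child cannot change its own score before exec without changing the
scheduler's too.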