Re: [patch 06/12] mm: oom_kill: simplify OOM killer locking

From: Johannes Weiner
Date: Thu Mar 26 2015 - 11:17:57 EST


On Thu, Mar 26, 2015 at 02:31:11PM +0100, Michal Hocko wrote:
> On Wed 25-03-15 02:17:10, Johannes Weiner wrote:
> > The zonelist locking and the oom_sem are two overlapping locks that
> > are used to serialize global OOM killing against different things.
> >
> > The historical zonelist locking serializes OOM kills from allocations
> > with overlapping zonelists against each other to prevent killing more
> > tasks than necessary in the same memory domain. Only when neither
> > tasklists nor zonelists from two concurrent OOM kills overlap (tasks
> > in separate memcgs bound to separate nodes) are OOM kills allowed to
> > execute in parallel.
> >
> > The younger oom_sem is a read-write lock to serialize OOM killing
> > against the PM code trying to disable the OOM killer altogether.
> >
> > However, the OOM killer is a fairly cold error path; there is really
> > no reason to optimize for highly performant and concurrent OOM kills.
> > And the oom_sem is just flat-out redundant.
> >
> > Replace both locking schemes with a single global mutex serializing
> > OOM kills regardless of context.
>
> OK, this is much simpler.
>
> You have missed drivers/tty/sysrq.c which should take the lock as well.
> ZONE_OOM_LOCKED can be removed as well. __out_of_memory in the kerneldoc
> should be renamed.

Argh, an older version had the lock inside out_of_memory() and I never
updated the caller when I changed the rules. Thanks. I'll fix both.
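
To illustrate the rule being discussed, here is a minimal user-space sketch, assuming a pthread mutex as a stand-in for the new global lock. The names out_of_memory_model(), sysrq_oom_model() and oom_lock_held are hypothetical and are not taken from the patch. The point is only that the lock lives outside the kill function, so every caller has to take it.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Stand-in for the single global mutex that serializes OOM kills. */
static pthread_mutex_t oom_lock = PTHREAD_MUTEX_INITIALIZER;
static bool oom_lock_held;	/* debug flag, only written under oom_lock */
static int kills;

/* Model of the kill function: the caller must already hold oom_lock. */
static bool out_of_memory_model(void)
{
	assert(oom_lock_held);	/* catches a caller that forgot the lock */
	kills++;
	return true;
}

/* Hypothetical sysrq-style caller that follows the convention. */
static bool sysrq_oom_model(void)
{
	bool ret;

	pthread_mutex_lock(&oom_lock);
	oom_lock_held = true;
	ret = out_of_memory_model();
	oom_lock_held = false;
	pthread_mutex_unlock(&oom_lock);
	return ret;
}
```

A caller that invokes out_of_memory_model() directly, without the lock, trips the assertion. That is the class of mistake a missed call site represents.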

> > @@ -795,27 +728,21 @@ bool out_of_memory(struct zonelist *zonelist, gfp_t gfp_mask,
> >   */
> >  void pagefault_out_of_memory(void)
> >  {
> > -	struct zonelist *zonelist;
> > -
> > -	down_read(&oom_sem);
> >  	if (mem_cgroup_oom_synchronize(true))
> > -		goto unlock;
> > +		return;
>
> OK, so we are back to what David has asked previously. We do not need
> the lock for memcg and oom_killer_disabled because we know that no tasks
> (except for potential oom victim) are lurking around at the time
> oom_killer_disable() is called. So I guess we want to stick a comment
> into mem_cgroup_oom_synchronize before we check for oom_killer_disabled.

I would prefer everybody that sets TIF_MEMDIE and kills a task to hold
the lock, including memcg. Simplicity is one thing, but also a global
OOM kill might not even be necessary when it's racing with the memcg.
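
A rough user-space sketch of that argument, under stated assumptions: victim_exiting stands in for "some task already has TIF_MEMDIE", and oom_kill_model() and mark_and_kill_locked() are hypothetical helpers, not kernel functions. When the memcg and global paths share one lock, whichever path comes second can see the first kill and skip its own.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

enum oom_context { OOM_MEMCG, OOM_GLOBAL };

static pthread_mutex_t oom_lock = PTHREAD_MUTEX_INITIALIZER;
static bool victim_exiting;	/* stand-in for a task holding TIF_MEMDIE */
static int kills_by[2];		/* kills issued per context */

/* Mark and kill a victim. The caller must hold oom_lock. */
static void mark_and_kill_locked(enum oom_context ctx)
{
	assert(!victim_exiting);	/* never pick a second victim */
	victim_exiting = true;
	kills_by[ctx]++;
}

/* Shared entry point: memcg and global kills take the same lock. */
static bool oom_kill_model(enum oom_context ctx)
{
	bool killed = false;

	pthread_mutex_lock(&oom_lock);
	if (!victim_exiting) {
		mark_and_kill_locked(ctx);
		killed = true;
	}
	pthread_mutex_unlock(&oom_lock);
	return killed;
}
```

Calling oom_kill_model(OOM_MEMCG) and then oom_kill_model(OOM_GLOBAL) kills one task in this model, not two. If the memcg path skipped the lock, the global path could check victim_exiting before the memcg path had set it, and both would kill.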

> After those are fixed, feel free to add
> Acked-by: Michal Hocko <mhocko@xxxxxxx>

Thanks.