Re: [PATCH -v2 2/5] OOM: thaw the OOM victim if it is frozen

From: Michal Hocko
Date: Sun Dec 07 2014 - 05:24:38 EST


On Sat 06-12-14 08:06:57, Tejun Heo wrote:
> Hello,
>
> On Fri, Dec 05, 2014 at 05:41:44PM +0100, Michal Hocko wrote:
> > oom_kill_process only sets TIF_MEMDIE flag and sends a signal to the
> > victim. This is basically noop when the task is frozen though because
> > the task sleeps in uninterruptible sleep. The victim is eventually
> > thawed later when oom_scan_process_thread meets the task again in a
> > later OOM invocation so the OOM killer doesn't live lock. But this is
> > less than optimal. Let's add the frozen check and thaw the task right
> > before we send SIGKILL to the victim.
> >
> > The check and thawing in oom_scan_process_thread has to stay because the
> > task might got access to memory reserves even without an explicit
> > SIGKILL from oom_kill_process (e.g. it already has fatal signal pending
> > or it is exiting already).
>
> How else would a task get TIF_MEMDIE? If there are other paths which
> set TIF_MEMDIE, the right thing to do is creating a function which
> thaws / wakes up the target task and use it there too. Please
> interlock these things properly from the get-go instead of scattering
> these things around.

See __out_of_memory which sets TIF_MEMDIE on current when it is exiting
or has fatal signals pending. This task cannot be frozen obviously.

> > @@ -545,6 +545,8 @@ void oom_kill_process(struct task_struct *p, gfp_t gfp_mask, int order,
> > rcu_read_unlock();
> >
> > mark_tsk_oom_victim(victim);
> > + if (frozen(victim))
> > + __thaw_task(victim);
>
> The frozen() test here is racy. Always calling __thaw_task() wouldn't
> be. You can argue that being racy here is okay because the later
> scanning would find it but why complicate things like that? Just
> properly interlock each instance and be done with it.

OK, changed. I didn't realize that __thaw_task does the check already
and was following what we have in oom_scan_process_thread. Removed the
check from that one as well.
--
Michal Hocko
SUSE Labs
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/