Re: [PATCH 07/10] mm, oom: fortify task_will_free_mem

From: Michal Hocko
Date: Fri Jun 17 2016 - 08:26:55 EST


On Fri 17-06-16 20:38:01, Tetsuo Handa wrote:
> Michal Hocko wrote:
> > > > Anyway, would you be OK with the patch if I added the current->mm check
> > > > and resolve its necessity in a separate patch?
> > >
> > > Please correct task_will_free_mem() in oom_kill_process() as well.
> >
> > We cannot hold task_lock over all task_will_free_mem I am even not sure
> > we have to develop an elaborate way to make it raceless just for the nommu
> > case. The current case is simple as we cannot race here. Is that
> > sufficient for you?
>
> We can use find_lock_task_mm() inside mark_oom_victim().
> That is, call wake_oom_reaper() from mark_oom_victim() like
>
> void mark_oom_victim(struct task_struct *tsk, bool can_use_oom_reaper)
> {
> WARN_ON(oom_killer_disabled);
> /* OOM killer might race with memcg OOM */
> tsk = find_lock_task_mm(tsk);
> if (!tsk)
> return;
> if (test_and_set_tsk_thread_flag(tsk, TIF_MEMDIE)) {
> task_unlock(tsk);
> return;
> }
> task_unlock(tsk);
> atomic_inc(&tsk->signal->oom_victims);
> /*
> * Make sure that the task is woken up from uninterruptible sleep
> * if it is frozen because OOM killer wouldn't be able to free
> * any memory and livelock. freezing_slow_path will tell the freezer
> * that TIF_MEMDIE tasks should be ignored.
> */
> __thaw_task(tsk);
> atomic_inc(&oom_victims);
> if (can_use_oom_reaper)
> wake_oom_reaper(tsk);
> }
>
> and move mark_oom_victim() by normal path to after task_unlock(victim).
>
> do_send_sig_info(SIGKILL, SEND_SIG_FORCED, victim, true);
> - mark_oom_victim(victim);
>
> - if (can_oom_reap)
> - wake_oom_reaper(victim);
> + wake_oom_reaper(victim, can_oom_reap);

I do not like this because then we would have to check the reapability
from inside the oom_reaper again.

But let me ask again. Does this really matter so much just because of
nommu where we can fall in different traps? Can we simply focus on mmu
(aka vast majority of cases) make it work reliably and see what we can
do with nommu later?

--
Michal Hocko
SUSE Labs