Re: [PATCH] mm,oom: Exclude TIF_MEMDIE processes from candidates.

From: Michal Hocko
Date: Mon Jan 11 2016 - 10:18:42 EST


On Fri 08-01-16 00:38:43, Tetsuo Handa wrote:
> Michal Hocko wrote:
> > @@ -333,6 +333,14 @@ static struct task_struct *select_bad_process(struct oom_control *oc,
> > if (points == chosen_points && thread_group_leader(chosen))
> > continue;
> >
> > + /*
> > + * If the current major task is already oom killed and this
> > + * is sysrq+f request then we rather choose somebody else
> > + * because the current oom victim might be stuck.
> > + */
> > + if (is_sysrq_oom(oc) && test_tsk_thread_flag(p, TIF_MEMDIE))
> > + continue;
> > +
> > chosen = p;
> > chosen_points = points;
> > }
>
> Do we want to require SysRq-f for each thread in a process?
> If g has 1024 p, dump_tasks() will do
>
> pr_info("[%5d] %5d %5d %8lu %8lu %7ld %7ld %8lu %5hd %s\n",
>
> for 1024 times? I think one SysRq-f per one process is sufficient.

I am not following you here. If we kill the process, the whole thread
group (aka all threads) will get killed, whichever thread we happen to
send the SIGKILL to.
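
To illustrate the point from userspace, here is a minimal sketch. It is
not kernel code, and the helper name sigkill_one_thread_kills_group() is
made up for this example. It assumes Linux with the raw gettid and
tgkill syscalls available. A forked child starts one extra thread,
directs SIGKILL at that thread only, and the parent checks how the
child ended.

```c
#include <pthread.h>
#include <signal.h>
#include <stdatomic.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Kernel thread ID of the worker, published once the worker runs. */
static atomic_int worker_tid;

static void *worker(void *arg)
{
	(void)arg;
	atomic_store(&worker_tid, (int)syscall(SYS_gettid));
	for (;;)
		pause();	/* sleep until a signal arrives */
	return NULL;
}

/*
 * Hypothetical helper: returns 1 if a SIGKILL aimed at a single
 * non-leader thread terminated the whole process, 0 if the process
 * ended some other way, -1 on a setup error.
 */
int sigkill_one_thread_kills_group(void)
{
	pid_t child = fork();
	int status;

	if (child < 0)
		return -1;

	if (child == 0) {
		pthread_t t;

		if (pthread_create(&t, NULL, worker, NULL))
			_exit(2);
		while (!atomic_load(&worker_tid))
			usleep(1000);

		/* Target only the worker thread, not the group leader. */
		syscall(SYS_tgkill, getpid(), atomic_load(&worker_tid), SIGKILL);

		/* Only reached if the group somehow survives. */
		sleep(5);
		_exit(3);
	}

	if (waitpid(child, &status, 0) != child)
		return -1;

	return WIFSIGNALED(status) && WTERMSIG(status) == SIGKILL;
}
```

Built with -pthread and called from a small main(), this should report
that the child was terminated by SIGKILL as a whole, even though the
signal was directed at a single thread.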

> How can we guarantee that find_lock_task_mm() from oom_kill_process()
> chooses !TIF_MEMDIE thread when try_to_sacrifice_child() somehow chose
> !TIF_MEMDIE thread? I think choosing !TIF_MEMDIE thread at
> find_lock_task_mm() is the simplest way.

find_lock_task_mm choosing a TIF_MEMDIE thread shouldn't change anything
because the whole thread group will go down anyway. If you want to
guarantee that sysrq+f never chooses a task which has a TIF_MEMDIE
thread then we would have to check for fatal_signal_pending as well
AFAIU. Fiddling with find_lock_task_mm will not help you though
unless I am missing something.
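
A rough userspace model of that stricter check, only to make the idea
concrete. Nothing below is kernel code: the toy_* types and functions
are invented for this sketch, the two booleans stand in for TIF_MEMDIE
and fatal_signal_pending(), and the scoring is reduced to "highest
points wins, first one on ties".

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for one thread of a process (hypothetical type). */
struct toy_thread {
	bool memdie;			/* models TIF_MEMDIE */
	bool fatal_signal_pending;	/* models fatal_signal_pending() */
};

/* Toy stand-in for a thread group (hypothetical type). */
struct toy_process {
	const struct toy_thread *threads;
	size_t nr_threads;
	unsigned long points;		/* models the badness score */
};

/* True if any thread is already an oom victim or about to die. */
static bool toy_process_is_dying(const struct toy_process *p)
{
	for (size_t i = 0; i < p->nr_threads; i++) {
		if (p->threads[i].memdie ||
		    p->threads[i].fatal_signal_pending)
			return true;
	}
	return false;
}

/*
 * Returns the index of the chosen process, or -1 if none qualifies.
 * For a sysrq+f style request, processes that are already dying are
 * skipped so that the request picks somebody else.
 */
int toy_select_bad_process(const struct toy_process *procs, size_t n,
			   bool sysrq)
{
	int chosen = -1;
	unsigned long chosen_points = 0;

	for (size_t i = 0; i < n; i++) {
		if (sysrq && toy_process_is_dying(&procs[i]))
			continue;
		/* Skip zero scores and anything not strictly better. */
		if (procs[i].points <= chosen_points)
			continue;
		chosen = (int)i;
		chosen_points = procs[i].points;
	}
	return chosen;
}
```

The check is done per process rather than per thread, so it does not
matter which thread of a dying group the scan happens to look at.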
--
Michal Hocko
SUSE Labs