Re: [PATCH] oom: consider multi-threaded tasks in task_will_free_mem

From: Michal Hocko
Date: Tue May 17 2016 - 16:25:52 EST


On Tue 17-05-16 20:42:25, Oleg Nesterov wrote:
> On 04/12, Michal Hocko wrote:
> >
> > We shouldn't consider the task
> > unless the whole thread group is going down.
>
> Yes, agreed. I'd even say that oom-killer should never look at individual
> task/threads, it should work with mm's. And one of the big mistakes (imo)
> was the s/for_each_process/for_each_thread/ change in select_bad_process()
> a while ago.
>
> Michal, I won't even try to actually review this patch, I lost any hope
> to understand OOM-killer a long ago ;) But I do agree with this change,
> we obviously should not rely on PF_EXITING.
>
> > static inline bool task_will_free_mem(struct task_struct *task)
> > {
> > + struct signal_struct *sig = task->signal;
> > +
> > /*
> > * A coredumping process may sleep for an extended period in exit_mm(),
> > * so the oom killer cannot assume that the process will promptly exit
> > * and release memory.
> > */
> > - return (task->flags & PF_EXITING) &&
> > - !(task->signal->flags & SIGNAL_GROUP_COREDUMP);
> > + if (sig->flags & SIGNAL_GROUP_COREDUMP)
> > + return false;
> > +
> > + if (!(task->flags & PF_EXITING))
> > + return false;
> > +
> > + /* Make sure that the whole thread group is going down */
> > + if (!thread_group_empty(task) && !(sig->flags & SIGNAL_GROUP_EXIT))
> > + return false;
> > +
> > + return true;
>
> So this looks certainly better to me, but perhaps it should do
>
> if (SIGNAL_GROUP_COREDUMP)
> return false;
>
> if (SIGNAL_GROUP_EXIT)
> return true;
>
> if (thread_group_empty() && PF_EXITING)
> return true;
>
> return false;
>
> ?
>
> I won't insist, I do not even know if this would be better or not. But if
> SIGNAL_GROUP_EXIT is set all sub-threads should go away even if PF_EXITING
> is not set yet because this task didn't dequeue SIGKILL yet.
>
> Up to you in any case.

I have structured the checks this way because I expect I would like to
have all early break outs as false. This will help when we want to
extend those with further more specific checks. E.g. if the task is
sharing the mm with another thread group.

Anyway thanks for the review Oleg!
--
Michal Hocko
SUSE Labs