Re: [PATCH 4/6] mm, oom: skip vforked tasks from being selected

From: Michal Hocko
Date: Wed Jun 01 2016 - 03:10:13 EST


On Tue 31-05-16 23:43:38, Oleg Nesterov wrote:
> On 05/31, Michal Hocko wrote:
> >
> > On Mon 30-05-16 21:28:57, Oleg Nesterov wrote:
> > >
> > > I don't think we can trust vfork_done != NULL.
> > >
> > > copy_process() doesn't disallow CLONE_VFORK without CLONE_VM, so with this patch
> > > it would be trivial to make the exploit which hides a memory hog from oom-killer.
> >
> > OK, I wasn't aware of this possibility.
>
> Neither was me ;) I noticed this during this review.

Heh, as I said in the other email, this is a land of dragons^Wsurprises.

> > > Or I am totally confused?
> >
> > I cannot judge I am afraid. You are definitely much more familiar with
> > all these subtle details than me.
>
> OK, I just verified that clone(CLONE_VFORK|SIGCHLD) really works to be sure.

Great, thanks.
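
For anyone following along, a userspace check of that claim could look
roughly like the sketch below. This is not the test Oleg ran; the names
shared_marker, child_fn and clone_vfork_without_vm_is_private are made up
for illustration, and it assumes Linux with the glibc clone() wrapper and
a downward-growing stack.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Hypothetical marker: only the child writes to it. */
static volatile int shared_marker;

static int child_fn(void *arg)
{
	(void)arg;
	/* Without CLONE_VM this store lands in the child's private copy. */
	shared_marker = 1;
	return 0;	/* becomes the child's exit status */
}

/*
 * Returns 1 if the child's store is NOT visible in the parent (separate
 * mm), 0 if it is visible (shared mm), -1 on error.
 */
int clone_vfork_without_vm_is_private(void)
{
	const size_t stack_size = 64 * 1024;
	char *stack = malloc(stack_size);
	int status;
	pid_t pid;

	if (!stack)
		return -1;

	shared_marker = 0;

	/* CLONE_VFORK without CLONE_VM: parent sleeps, but the mm is copied. */
	pid = clone(child_fn, stack + stack_size, CLONE_VFORK | SIGCHLD, NULL);
	if (pid < 0) {
		free(stack);
		return -1;
	}

	if (waitpid(pid, &status, 0) != pid) {
		free(stack);
		return -1;
	}
	free(stack);

	if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
		return -1;

	return shared_marker == 0;
}
```

If the call returns 1, the child had vfork_done set while running on its
own mm, which is exactly the case the ->vfork_done check alone would
mishandle.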

> > +/* expects to be called with task_lock held */
> > +static inline bool in_vfork(struct task_struct *tsk)
> > +{
> > +	bool ret;
> > +
> > +	/*
> > +	 * need RCU to access ->real_parent if CLONE_VM was used along with
> > +	 * CLONE_PARENT
> > +	 */
> > +	rcu_read_lock();
> > +	ret = tsk->vfork_done && tsk->real_parent->mm == tsk->mm;
> > +	rcu_read_unlock();
> > +
> > +	return ret;
> > +}
>
> Yes, but may I ask to add a comment? And note that "expects to be called with
> task_lock held" looks misleading, we do not need the "stable" tsk->vfork_done
> since we only need to check if it is NULL or not.

OK, I thought task_lock was needed for stability, but as you explain
below that is not really true...

> It would be nice to explain that
>
> 1. we check real_parent->mm == tsk->mm because CLONE_VFORK does not
>    imply CLONE_VM
>
> 2. CLONE_VFORK can be used with CLONE_PARENT/CLONE_THREAD and thus
>    ->real_parent is not necessarily the task doing vfork(), so in
>    theory we can't rely on task_lock() if we want to dereference it.
>
>    And in this case we can't trust the real_parent->mm == tsk->mm
>    check, it can be false negative. But we do not care, if init or
>    another oom-unkillable task does this it should blame itself.

I've stolen this explanation and put it right there.
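
To make the three cases concrete, here is a small userspace model of the
check. It is only a sketch: struct model_mm, struct model_task and the
model_* helpers are hypothetical stand-ins rather than kernel types, and
the RCU protection around ->real_parent is deliberately left out.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for mm_struct and task_struct. */
struct model_mm {
	int unused;
};

struct model_task {
	struct model_mm *mm;
	void *vfork_done;		/* non-NULL while the vfork parent waits */
	struct model_task *real_parent;
};

/* Same predicate as in_vfork() in the patch, minus the RCU section. */
bool in_vfork_model(const struct model_task *tsk)
{
	return tsk->vfork_done != NULL && tsk->real_parent->mm == tsk->mm;
}

/* Plain vfork(): CLONE_VFORK | CLONE_VM, the parent is the vforker. */
bool model_plain_vfork(void)
{
	struct model_mm mm = { 0 };
	int completion = 0;
	struct model_task parent = { &mm, NULL, NULL };
	struct model_task child = { &mm, &completion, &parent };

	return in_vfork_model(&child);
}

/* CLONE_VFORK without CLONE_VM: vfork_done is set but the mm is private. */
bool model_vfork_without_vm(void)
{
	struct model_mm parent_mm = { 0 }, child_mm = { 0 };
	int completion = 0;
	struct model_task parent = { &parent_mm, NULL, NULL };
	struct model_task child = { &child_mm, &completion, &parent };

	return in_vfork_model(&child);
}

/*
 * CLONE_VFORK | CLONE_VM | CLONE_PARENT: the child shares the vforker's mm
 * but ->real_parent is the vforker's parent, so the check is a false
 * negative.
 */
bool model_vfork_with_clone_parent(void)
{
	struct model_mm grandparent_mm = { 0 }, vforker_mm = { 0 };
	int completion = 0;
	struct model_task grandparent = { &grandparent_mm, NULL, NULL };
	struct model_task child = { &vforker_mm, &completion, &grandparent };

	return in_vfork_model(&child);
}
```

Only the first helper returns true. The second is the hole Oleg found in
the vfork_done-only test, and the third is the false negative we have
decided to tolerate.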
--
Michal Hocko
SUSE Labs