Re: [PATCH 6/6] clone4: Introduce new CLONE_FD flag to get task exit notification via fd

From: Josh Triplett
Date: Sat Mar 14 2015 - 16:03:36 EST


On Sat, Mar 14, 2015 at 08:18:36PM +0100, Oleg Nesterov wrote:
> On 03/14, Josh Triplett wrote:
> >
> > On Sat, Mar 14, 2015 at 11:38:29AM -0700, Thiago Macieira wrote:
> > > On Saturday 14 March 2015 15:32:35 Oleg Nesterov wrote:
> > > > It is not clear to me what do_wait() should do with ->autoreap child, even
> > > > ignoring ptrace.
> > > >
> > > > Just suppose that real_parent has a single "autoreap" child. Should
> > > > wait(NULL) hang then?
> > >
> > > It should ignore the child that is set to autoreap. wait(NULL) should return
> > > -ECHILD, indicating there are no children waiting to be reaped.
> >
> > Right. And I don't think the current code does this. I think we need
> > to change wait_consider_task to early-return for ->autoreap just as it
> > does for exit_state == EXIT_DEAD.
>
> No. This EXIT_DEAD is absolutely different. And this is another indication
> that you might use it wrongly ;)

Is there any information somewhere on how this state machine of doom is
*supposed* to work? :) Why would "p->exit_state == EXIT_DEAD" mean
something different in wait_consider_task?

> What we actually want is BUG_ON(exit_state == EXIT_DEAD) here. We do not
> want the EXIT_DEAD tasks in ->children/ptraced lists. These EXIT_DEAD tasks
> complicate the exit/wait/reparent paths.

Pulling the EXIT_DEAD tasks out of those lists completely does sound
like a good simplification. However, that doesn't seem to be the
current expectation in wait_consider_task, which just returns if
p->exit_state == EXIT_DEAD to skip considering that task.

And an autoreaping task isn't necessarily dead yet; it just shouldn't be
waited on.
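
To make the intended semantics concrete, here is a rough userspace sketch
of the filtering I have in mind. It is only a model, not kernel code and
not the patch: every model_* name is hypothetical, the struct is a
stand-in for the few task_struct fields that matter here, and the
non-blocking "wait" ignores ptrace and locking entirely.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; not the kernel's definitions. */
enum model_exit_state { MODEL_RUNNING, MODEL_EXIT_ZOMBIE, MODEL_EXIT_DEAD };

struct model_task {
	int pid;
	enum model_exit_state exit_state;
	bool autoreap;	/* assumed per-task flag, as discussed in this thread */
};

/* Should a plain wait() look at this child at all? */
static bool model_wait_considers(const struct model_task *p)
{
	if (p->exit_state == MODEL_EXIT_DEAD)
		return false;	/* already being released by someone else */
	if (p->autoreap)
		return false;	/* alive or not, never reported to wait() */
	return true;
}

/*
 * Non-blocking wait over a child list.
 * Returns the pid of a reaped zombie, 0 if eligible children exist but
 * none has exited yet, or -ECHILD if there is nothing to wait for.
 */
static int model_wait_any(struct model_task *children, size_t n)
{
	bool eligible = false;

	assert(children != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		struct model_task *p = &children[i];

		if (!model_wait_considers(p))
			continue;
		eligible = true;
		if (p->exit_state == MODEL_EXIT_ZOMBIE) {
			p->exit_state = MODEL_EXIT_DEAD;	/* reap it */
			return p->pid;
		}
	}
	return eligible ? 0 : -ECHILD;
}
```

With a single autoreaping child in the list, model_wait_any() returns
-ECHILD whether or not that child is still running, which is the
behavior Thiago described above.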

> However, currently this is TODO. The main problem is the locking in
> wait_task_zombie(), we can't set EXIT_DEAD and remove the task from list
> under read_lock().

That appears to be only reachable for zombies, which an autoreaping task
should never become.

> And please see another email from me. So far I disagree that wait(NULL)
> should return ECHILD unconditionally. At least unless this is discussed
> separately.

I'll respond in that separate thread, but one issue there: waiting for
any child process cannot safely return an autoreaping child process,
because that would introduce a race condition. The PID the parent gets
back can disappear at any time, so there's nothing useful the parent can
do with it.
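
A toy illustration of that race, again as a userspace model with
hypothetical model_* names. It assumes a tiny PID space with
lowest-free allocation so that reuse happens immediately; the real
allocator behaves differently, but the hazard is the same: once the
autoreaping task has exited, its PID may name an unrelated task.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_PID_MAX 4	/* tiny PID space so reuse happens quickly */

/*
 * Hypothetical model: owner[i] holds the generation number of the task
 * that currently owns PID i + 1, or 0 if that PID is free.
 */
struct model_pid_table {
	unsigned int owner[MODEL_PID_MAX];
	unsigned int next_generation;
};

/* Allocate the lowest free PID; returns -1 if the table is full. */
static int model_alloc_pid(struct model_pid_table *t, unsigned int *generation)
{
	assert(t != NULL && generation != NULL);
	for (int i = 0; i < MODEL_PID_MAX; i++) {
		if (t->owner[i] == 0) {
			t->owner[i] = ++t->next_generation;
			*generation = t->owner[i];
			return i + 1;
		}
	}
	return -1;
}

/* An autoreaping task releases its PID as soon as it exits. */
static void model_autoreap_exit(struct model_pid_table *t, int pid)
{
	assert(t != NULL && pid >= 1 && pid <= MODEL_PID_MAX);
	t->owner[pid - 1] = 0;
}

/* Which task generation does this PID name right now? 0 means none. */
static unsigned int model_lookup(const struct model_pid_table *t, int pid)
{
	assert(t != NULL && pid >= 1 && pid <= MODEL_PID_MAX);
	return t->owner[pid - 1];
}
```

If a parent got PID 1 back from a wait call, and the child then exited
and a new task was created, model_lookup(&t, 1) would return the new
task's generation rather than the child's. Anything the parent did with
that PID would hit the wrong task.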

- Josh Triplett