Re: strace lockup when tracing exec in go

From: Oleg Nesterov
Date: Fri Sep 23 2016 - 09:22:05 EST


On 09/23, Michal Hocko wrote:
>
> On Fri 23-09-16 12:21:41, Oleg Nesterov wrote:
> > On 09/22, Michal Hocko wrote:
> > >
> > > --- a/kernel/signal.c
> > > +++ b/kernel/signal.c
> > > @@ -91,6 +91,10 @@ static int sig_ignored(struct task_struct *t, int sig, bool force)
> > >  	if (!sig_task_ignored(t, sig, force))
> > >  		return 0;
> > >
> > > +	/* Do not ignore signals sent from child to the parent */
> > > +	if (current->ptrace && current->parent == t)
> > > +		return 0;
> >
> > This doesn't look right in general, and this can't really help.
> >
> > This assumes that the tracer will call do_wait() after mm_access()
> > fails, but this is not necessarily true.
> >
> > Note also ptrace_attach(), -ERESTARTNOINTR means that the tracer won't
> > even return to user-space if SIGCHLD is ignored, the tracer will silently
> > restart the syscall.
>
> Well, it apparently does help the strace case.

Only because strace doesn't even try to handle -EINTR; it assumes this is not
possible, gives up, and calls wait() after that. So this change actually
breaks strace.

And once again, this can't really help. SIGCHLD can arrive before strace calls
process_vm_readv(), and in that case strace will enter the syscall with
signal_pending() false. IOW, this hack can only help if the tracer is already
sleeping in process_vm_readv().

Plus, again, "strace -f" can equally hang if mt-exec races with PTRACE_ATTACH.

> So I am not arguing this
> is the best fix but can it be harmful?

This change is simply wrong no matter what. We could change do_notify_parent()
to call signal_wake_up() if tsk->ptrace, but see above, this won't help.

Oleg.