Re: proc_flush_task oops

From: Eric W. Biederman
Date: Tue Dec 19 2017 - 16:45:02 EST

Dave Jones <davej@xxxxxxxxxxxxxxxxx> writes:

> On Tue, Dec 19, 2017 at 12:27:30PM -0600, Eric W. Biederman wrote:
> > Dave Jones <davej@xxxxxxxxxxxxxxxxx> writes:
> >
> > > On Mon, Dec 18, 2017 at 03:50:52PM -0800, Linus Torvalds wrote:
> > >
> > > > But I don't see what would have changed in this area recently.
> > > >
> > > > Do you end up saving the seeds that cause crashes? Is this
> > > > reproducible? (Other than seeing it twice, of course)
> > >
> > > Only clue so far, is every time I'm able to trigger it, the last thing
> > > the child process that triggers it did, was an execveat.
> >
> > Is there any chance the execveat might be called from a child thread?
> If trinity chooses one of the exec syscalls, it forks off an extra child
> to do it in, on the off-chance that it succeeds, and we never return.

	extrapid = fork();
	if (extrapid == 0) {
		/* grand-child */
		char childname[]="trinity-subchild";
		prctl(PR_SET_NAME, (unsigned long) &childname);

		__do_syscall(rec, GOING_AWAY);
		/* if this was for eg. an successful execve, we should never get here.
		 * if it failed though... */

That is interesting.

So the system call sequence is a fork which just succeeded and then an
exec. That reduces the possibilities quite a lot.
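
For anyone who wants to poke at that sequence outside of trinity, here is
a minimal sketch of it: fork, rename the child, then execveat. The helper
name exec_in_subchild, the child name, and the 127 exit code are my own
placeholders, not anything from trinity, and execveat is invoked through
syscall(2) on the assumption that the libc may not provide a wrapper.

```c
#define _GNU_SOURCE
#include <assert.h>       /* assert */
#include <errno.h>        /* errno, EINTR */
#include <fcntl.h>        /* AT_FDCWD */
#include <stddef.h>       /* NULL */
#include <sys/prctl.h>    /* prctl, PR_SET_NAME */
#include <sys/syscall.h>  /* SYS_execveat */
#include <sys/types.h>    /* pid_t */
#include <sys/wait.h>     /* waitpid, WIFEXITED, WEXITSTATUS */
#include <unistd.h>       /* fork, syscall, _exit */

/*
 * Hypothetical helper (not trinity code): fork a child and call
 * execveat(path) in it. Returns the child's exit status, 127 if the
 * exec failed in the child, or -1 if fork/waitpid failed or the child
 * did not exit normally.
 */
int exec_in_subchild(const char *path)
{
	pid_t pid;
	int status;

	assert(path != NULL);

	pid = fork();
	if (pid < 0)
		return -1;

	if (pid == 0) {
		/* child: the process that goes away on a successful exec */
		char childname[] = "demo-subchild";
		char *const argv[] = { (char *)path, NULL };
		char *const envp[] = { NULL };

		prctl(PR_SET_NAME, (unsigned long)childname);
		syscall(SYS_execveat, AT_FDCWD, path, argv, envp, 0);

		/* only reached if the exec failed */
		_exit(127);
	}

	/* parent: reap the child, retrying if a signal interrupts us */
	while (waitpid(pid, &status, 0) < 0) {
		if (errno != EINTR)
			return -1;
	}
	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling exec_in_subchild("/bin/true") should exercise the path where the
exec succeeds and the child never returns to the caller's code, while a
path that does not exist exercises the failure branch that trinity's
comment alludes to.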

With pids there was a recent change that just replaced the pid hash
table and the pid bitmap with an idr. It changes the locking somewhat
and probably changes the timing, so that might be the culprit.

I am trying to figure out if there is an interface that would let
ns_last_pid for a pid namespace be accessed before the first pid is
allocated, and I am not seeing one. It does not appear to be possible
to mount a proc for a pid namespace you are not currently in.
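
For reference, the userspace view of that value can be sketched as
below. This assumes the interface in question is the
/proc/sys/kernel/ns_last_pid sysctl, which is only present when the
kernel is built with CONFIG_CHECKPOINT_RESTORE. The function name
read_ns_last_pid is made up for illustration.

```c
#include <stdio.h>

/*
 * Hypothetical helper: read the last pid allocated in the caller's
 * pid namespace from the ns_last_pid sysctl. Returns -1 if the file
 * is missing or cannot be parsed.
 */
long read_ns_last_pid(void)
{
	FILE *f = fopen("/proc/sys/kernel/ns_last_pid", "r");
	long pid = -1;

	if (!f)
		return -1;
	if (fscanf(f, "%ld", &pid) != 1)
		pid = -1;
	fclose(f);
	return pid;
}
```

Any process that can open that file already holds a pid in the
namespace whose proc it is reading, so a reader like this should never
observe the namespace before its first pid has been allocated.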

*Scratches my head* I am not seeing anything obvious.