Re: proc_flush_task oops

From: Eric W. Biederman
Date: Thu Dec 21 2017 - 03:27:28 EST

Dave Jones <davej@xxxxxxxxxxxxxxxxx> writes:

> On Wed, Dec 20, 2017 at 12:25:52PM -0600, Eric W. Biederman wrote:
> > > >
> > > > If the warning triggers it means the bug is in alloc_pid and somehow
> > > > something has gotten past the is_child_reaper check.
> > >
> > > You're onto something.
> > >
> > I am not seeing where things go wrong, but that puts the recent pid bitmap, bit
> > hash to idr change in the suspect zone.
> >
> > Can you try reverting that change:
> >
> > e8cfbc245e24 ("pid: remove pidhash")
> > 95846ecf9dac ("pid: replace pid bitmap implementation with IDR API")
> >
> > While keeping the warning in place so we can see if this fixes the
> > allocation problem?
>
> So I can't trigger this any more with those reverted. I seem to hit a
> bunch of other long-standing bugs first. I'll keep running it
> overnight, but it looks like this is where the problem lies.

I would really like to hear from the people who made this change if they
are interested in tracking down this failure.

It might be as simple as the locking changed enough that the locking
instrumentation is now slowing things down, and opening up an old race.

I have stared at this code and written some test programs, and I can't
see what is going on. alloc_pid by design and in implementation (as far
as I can see) is always single-threaded when allocating the first pid
in a pid namespace. idr_init always initializes idr_next to 0.

So how we can get past:

	if (unlikely(is_child_reaper(pid))) {
		if (pid_ns_prepare_proc(ns)) {
			goto out_free;

with proc_mnt still set to NULL is a mystery to me.

Is there any chance the idr code doesn't always return the lowest valid
free number? So init gets assigned something other than 1?
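To make that last question concrete, here is a small userspace toy of a
cursor-based ("cyclic") allocator. It is only a sketch under my own
assumptions, not the kernel's idr code: struct toy_idr, toy_idr_init,
toy_alloc_cyclic, toy_free and TOY_MAX_ID are all made-up names. The point
it illustrates is that an allocator that searches from a cursor does not
in general hand back the lowest free number, but with the cursor starting
at 0 and nothing allocated yet, the first allocation from a range starting
at 1 should still come back as 1.

```c
#include <stdbool.h>

#define TOY_MAX_ID 8	/* hypothetical tiny id space: ids 0..7 */

/* Toy stand-in for an id allocator with a cursor; NOT struct idr. */
struct toy_idr {
	bool used[TOY_MAX_ID];
	int next;	/* cursor: where the next search starts */
};

/* Start with nothing allocated and the cursor at 0. */
static void toy_idr_init(struct toy_idr *idr)
{
	for (int i = 0; i < TOY_MAX_ID; i++)
		idr->used[i] = false;
	idr->next = 0;
}

/* Return the lowest free id in [from, end), or -1 if there is none. */
static int toy_scan(const struct toy_idr *idr, int from, int end)
{
	for (int id = from; id < end; id++)
		if (!idr->used[id])
			return id;
	return -1;
}

/*
 * Allocate an id in [start, end), with end <= TOY_MAX_ID.  Search from
 * the cursor first and wrap around to start only if that fails.
 * Returns -1 when the range is full.
 */
static int toy_alloc_cyclic(struct toy_idr *idr, int start, int end)
{
	int from = idr->next > start ? idr->next : start;
	int id = toy_scan(idr, from, end);

	if (id < 0 && from > start)
		id = toy_scan(idr, start, end);	/* wrap around */
	if (id < 0)
		return -1;

	idr->used[id] = true;
	idr->next = id + 1;	/* advance the cursor past what we handed out */
	return id;
}

/* Release an id; the cursor is deliberately left alone. */
static void toy_free(struct toy_idr *idr, int id)
{
	idr->used[id] = false;
}
```

In this toy, allocating twice from [1, TOY_MAX_ID) on a fresh allocator
gives 1 and then 2. Freeing 1 and allocating again gives 3 rather than 1,
because the cursor has moved on. So "not always the lowest free number" is
normal for this style of allocator, but it does not by itself explain how
the very first allocation in a new namespace could be anything other
than 1.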