Re: while_each_thread() under rcu_read_lock() is broken?

From: Paul E. McKenney
Date: Thu Jun 24 2010 - 23:38:04 EST


On Thu, Jun 24, 2010 at 02:14:46PM -0700, Roland McGrath wrote:
> > First, what "bad things" can happen to a reader scanning a thread
> > group?
> >
> > 1. The thread-group leader might do exec(), destroying the old
> > list and forming a new one. In this case, we want any readers
> > to stop scanning.
>
> This doesn't do anything different (for these concerns) from just all the
> other threads happening to exit right before the exec. There is no
> "destroying the old" and "forming the new", it's just that all the other
> threads are convinced to die now. There is no problem here.

Understood -- I wasn't saying that each category posed a unique problem,
but rather making sure that I really had enumerated all the possibilities.
The reason for my "destroying the old" and "forming the new" is the
possibility of someone doing proc_task_readdir() when the group leader
does exec(), which causes all to die, and then the new process does
pthread_create(), forming a new thread group. Because proc_task_readdir()
neither holds a lock nor stays in an RCU read-side critical section
for the full /proc scan, the old group might really be destroyed from
the reader's point of view.

That said, I freely admit that I am not very familiar with this code.

> > 2. Some other thread might do exec(), destroying the old list and
> > forming a new one. In this case, we also want any readers to
> > stop scanning.
>
> Again, the list is not really destroyed, just everybody dies. What is
> different here is that ->group_leader changes. This is the only time
> that ever happens. Moreover, it's the only time that a task that was
> previously pointed to by any ->group_leader can be reaped before the
> rest of the group has already been reaped first (and thus the
> thread_group made a singleton).

Yep! Same proc_task_readdir() situation as before. The group leader
cannot go away because proc_task_readdir() takes a reference.

> > 3. The thread-group leader might do pthread_exit(), removing itself
> > from the thread group -- and might do so while the hapless reader
> > is referencing that thread.
>
> This is called the delay_group_leader() case. It doesn't happen in a
> way that has the problems you are concerned with. The group_leader
> remains in EXIT_ZOMBIE state and can't be reaped until all the other
> threads have been reaped. There is no time at which any thread in the
> group is in any hashes or accessible by any means after the (final)
> group_leader is reaped.

OK, good to know -- that does make things simpler.

> > 4. Some other thread might do pthread_exit(), removing itself
> > from the thread group, and again might do so while the hapless
> > reader is referencing that thread. In this case, we want
> > the hapless reader to continue scanning the remainder of the
> > thread group.
>
> This is the most normal case (and #1 is effectively just this repeated
> by every thread in parallel).

Agreed. One possible difference is that in #1, no one is going to
complain about the reader quitting, while in this case someone might
well be annoyed if the list of threads is incomplete.

> > 5. The thread-group leader might do exit(), destroying the old
> > list without forming a new one. In this case, we want any
> > readers to stop scanning.
>
> All this means is everybody is convinced to die, and the group_leader
> dies too. It is not discernibly different from #6.

Seems reasonable.

> > 6. Some other thread might do exit(), destroying the old list
> > without forming a new one. In this case, we also want any
> > readers to stop scanning.
>
> This just means everybody is convinced to die and is not materially
> different from each individual thread all happening to die at the same
> time.

Fair enough. Again, my goal was to ensure that I had covered all the
cases as opposed to ensuring that I had described them minimally.

> You've described all these cases as "we want any readers to stop
> scanning". That is far too strong, and sounds like some kind of
> guaranteed synchronization, which does not and need not exist. Any
> reader that needs a dead thread to be off the list holds siglock
> and/or tasklist_lock. For the casual readers that only use
> rcu_read_lock, we only "want any readers' loops eventually to
> terminate and never to dereference stale pointers". That's why
> normal RCU listiness is generally fine.

OK, so maybe "it is OK for readers to stop scanning" is a better way
of putting it?

> The only problem we have is in #2. This is only a problem because
> readers' loops may be using the old ->group_leader pointer as the
> anchor for their circular-list round-robin loop. Once the former
> leader is removed from the list, that loop termination condition can
> never be met.

Does Oleg's checking for the group leader still being alive look correct
to you?

Thanx, Paul
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/