Re: main thread pthread_exit/sys_exit bug!

From: Kaz Kylheku
Date: Tue Feb 03 2009 - 18:06:24 EST


On Tue, Feb 3, 2009 at 1:32 PM, Oleg Nesterov <oleg@xxxxxxxxxx> wrote:
> On 02/03, Kaz Kylheku wrote:
>>
>> Well, it doesn't bother me that that has to be thrown out.
>> In fact, I do not agree with the requirement that the thread
>> which calls pthread_exit must not respond to signals;
>> the original patch works for me.
>
> What about other users? We can't know what how much they
> depend on the current behaviour.

If they haven't run into this gaping job control issue,
they haven't done a whole lot of testing, obviously!

Those who have run into it would certainly have to implement
a workaround --- like not calling pthread_exit from the main
thread!

I.e. ``Q: Doctor, it hurts when I do this; A: So don't
do that!''.

> OK, OK. Please forget about signals, futexes, etc.
> Simple program:
>
> pthread_t main_thread;
>
> void *tfunc(void *a)
> {
> pthread_joni(main_thread, NULL);
> return NULL;
> }
>
> int main(void)
> {
> pthread_t thr;
>
> main_thread = pthread_self();
> pthread_create(&thr, NULL, tfunc, NULL);
> pthread_exit(NULL);
> }

This test case appears to be conforming, so it
has to work.

The initial thread is considered joinable.

For instance a Rationale note in Issue 6 of the
SUS claims that one reason for the existence of
pthread_detach is so that the initial thread could
be detached, which cannot be done through thread
creation attributes for that thread. So the intent
is clearly that the initial thread is joinable!

> I bet this will hang with your patch applied.

It almost certainly will, and it does have to do with
futexes.

The main thread hasn't gone through the step where
it clears the TID, so the lll_wait_tid
futex wait in pthread_join will block. There is no
short-circuit indication in the library to indicate that
the main thread has died.

This TID trick is analogous to the robust list clean up. It's the same
kind of thing: fixing up a value of some registered
user-space memory location, signaling.

> Kaz, you know, it is not easy to say "you patch is wrong
> in any case, no matter how much it will be improved" ;)

Sure it is!

I will save you from that, because I do not believe in piling
hacks on top of hacks to fix something that may be
the wrong approach, even in situations where there is a
good chance that after some finite number of hacks,
it will finally be right. I did that in LinuxThreads once upon
a time (mind you, that was so great, FreeBSD had to have it!)

I do think there is a clean, non-hacky way to reason
about this.

If we think about the process-containing-threads model that
the kernel is trying to emulate, and what should happen
when threads exit, we come to the following reasoning:

When a thread exits, there does have to be certain cleanup
with respect to that thread. But the process-wide cleanup
is not performed until all the threads are gone (thread
count hits zero). This is easy to implement under the
process-contains-threads model.

These actions are not cleanly separated in Linux. There
is a do_exit function which handles both the thread-things
and the process-things in one swoop, so to speak.

The zombie problem occurs because do_exit goes too
far, cleaning up things that it shouldn't; things that
are necessary in holding up the POSIX-conforming
illusion that there is a process that contains threads.

My kneejerk approch was: hey wait, let's hold back
from doing /anything/ in do_exit; in fact let's not
call it at all if we're the initial thread and there are
still others. But obviously, it's not just anything in
do_exit that causes problems. Maybe some in-between
approach can work. The pthread_join test case can
be fixed in a clean way, as can robust cleanup.

The thread can signal pthread_join by resetting its TID
to zero and hitting the futex. It can do the robust
list cleanup, etc.

If you can identify a good separation about what to do
first, and what to do later, maybe you can some decent
compromise among the concerns. Breaking up the
do_exit logic into

do_exit_thread
do_exit_process

would probably not hurt. You then have to pick whether
each action belongs to one or the other and stick
it into the appropriate function, with some clear
guidelines about what goes where.

In sys_exit, the two pieces could be used somehow like this:

do_exit_thread();

if (leader and not_empty(group))
{ special_logic(); }

do_exit_process();

As one rule (for instance), any cleanup that threatens
the integrity of the process/thread model goes into do_exit_process.

So for instance if you ptrace that process, it still has all of its
memory areas intact (you don't have to look for a different
PID in the process list in order to find them).
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/