Re: main thread pthread_exit/sys_exit bug!

From: Oleg Nesterov
Date: Tue Feb 03 2009 - 16:35:29 EST


On 02/03, Kaz Kylheku wrote:
>
> Well, it doesn't bother me that that has to be thrown out.
> In fact, I do not agree with the requirement that the thread
> which calls pthread_exit must not respond to signals;
> the original patch works for me.

What about other users? We can't know how much they
depend on the current behaviour.

> I.e. in my embedded GNU/Linux distro, that requirement
> doesn't exist. And since I can't find it in the Single
> Unix Specification, so much for that!
>
> Nothing in the spec says that once pthread_exit is called,
> signals are stopped. This function invokes cleanup handling,
> and thread-specific-storage destruction. During any of those
> tasks, signals can still be happening. Any of those
> tasks can easily enter into an indefinite wait. What if
> a cleanup handler performs a blocking RPC to a remote
> server? Well, there you are, stuck in pthread_exit,
> handling signals, and not cleaning up your robust list, etc.
>
> I also don't require robust locks to be cleaned up
> instantly if they are owned by a main thread that has
> called pthread_exit.

OK, OK. Please forget about signals, futexes, etc.
Simple program:

#include <pthread.h>

pthread_t main_thread;

void *tfunc(void *a)
{
	/* wait for the main thread, which has already called pthread_exit() */
	pthread_join(main_thread, NULL);
	return NULL;
}

int main(void)
{
	pthread_t thr;

	main_thread = pthread_self();
	pthread_create(&thr, NULL, tfunc, NULL);
	pthread_exit(NULL);
}

I bet this will hang with your patch applied. Because
we depend on sys_futex(->clear_child_tid, FUTEX_WAKE, ...).
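
To check this without risking a wedged shell, something like the sketch
below could be used. It runs the same test in a forked child and gives up
after a timeout. The helper name join_exited_main_thread() and the 10ms
polling interval are made up for illustration. As far as I understand it,
the joiner sleeps on the main thread's tid word, and it is the exit path
that clears that word and does the FUTEX_WAKE.

```c
#include <pthread.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

static pthread_t main_thread;

static void *tfunc(void *a)
{
	(void)a;
	/* Sleeps until the exiting main thread's tid word is cleared and woken. */
	pthread_join(main_thread, NULL);
	return NULL;
}

/*
 * Hypothetical helper, not part of any library.
 * Returns 0 if the join completed and the child exited cleanly,
 * 1 if the child had to be killed after timeout_sec seconds,
 * -1 on any other failure.
 */
int join_exited_main_thread(int timeout_sec)
{
	struct timespec step = { .tv_sec = 0, .tv_nsec = 10 * 1000 * 1000 };
	pid_t pid = fork();

	if (pid < 0)
		return -1;

	if (pid == 0) {
		pthread_t thr;

		main_thread = pthread_self();
		if (pthread_create(&thr, NULL, tfunc, NULL))
			_exit(2);
		pthread_exit(NULL);	/* leader exits, tfunc keeps running */
	}

	/* Poll in 10ms steps; waitpid() returns 0 while any thread is alive. */
	for (int i = 0; i < timeout_sec * 100; i++) {
		int status;
		pid_t r = waitpid(pid, &status, WNOHANG);

		if (r == pid)
			return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
		if (r < 0)
			return -1;
		nanosleep(&step, NULL);
	}

	/* The joiner never woke up: treat it as the hang and clean up. */
	kill(pid, SIGKILL);
	waitpid(pid, NULL, 0);
	return 1;
}
```

With the current behaviour I would expect join_exited_main_thread(5) to
return 0. A return value of 1 would mean the joiner was never woken.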

Kaz, you know, it is not easy to say "your patch is wrong
in any case, no matter how much it will be improved" ;)
But even if the current behaviour is not optimal, we must not
change it unless we think it leads to bugs. We can't know
which application can suffer. The current behaviour is old.

> Face it, allowing the thread leader to exit is as wrong as doing
> other stupid things to the leader, like unsharing the signal
> handler.

Perhaps. That is why I said _something_ like your patch perhaps
makes sense. But this is tricky, and I don't see a simple/clean
way to improve things. And, otoh, I do not see _real_ problems
with the zombie leaders.


As for the original problem, it should be fixed anyway. wait_task_stopped()
should take SIGNAL_STOP_STOPPED into account, not task->state.
Unless we are the ptracer, of course.
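
To make the difference concrete, here is a small user-space model of the
two checks. This is not kernel code: the struct layouts, the MODEL_* names
and both function names are invented for illustration, and only the idea
(look at the group-wide stop flag unless we are the ptracer) comes from the
paragraph above.

```c
#include <stdbool.h>

/* Model values only; these are not the kernel's definitions. */
enum model_state { MODEL_RUNNING, MODEL_STOPPED, MODEL_TRACED, MODEL_ZOMBIE };
#define MODEL_SIGNAL_STOP_STOPPED 0x1u

struct model_signal {
	unsigned int flags;		/* shared by the whole thread group */
};

struct model_task {
	enum model_state state;		/* per-thread state */
	struct model_signal *signal;
};

/* Check based only on the task's own state. */
bool stopped_by_task_state(const struct model_task *p)
{
	return p->state == MODEL_STOPPED || p->state == MODEL_TRACED;
}

/*
 * Sketch of the suggested rule: a ptracer still looks at the task itself,
 * everyone else looks at the group-wide "stop completed" flag.
 */
bool stopped_by_proposed_rule(const struct model_task *p, bool ptrace)
{
	if (ptrace)
		return stopped_by_task_state(p);
	return (p->signal->flags & MODEL_SIGNAL_STOP_STOPPED) != 0;
}
```

For a zombie leader in a stopped group (state MODEL_ZOMBIE, flag set), the
first check says "not stopped" while the second, for a non-ptracer, says
"stopped".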

Oleg.
