Re: Thread group exec race -> null pointer... HELP

From: George Anzinger
Date: Wed Nov 23 2005 - 15:33:16 EST

Oleg Nesterov wrote:
George Anzinger wrote:

Still rooting around in the above. The test program is attached. It
creates and arms a repeating timer and then clones a thread which does
an exec() call.
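
A minimal sketch of a program of the kind described above. It is not the actual attachment, which is not reproduced here. It assumes a plain SIGEV_SIGNAL timer delivering SIGALRM, uses pthread_create() for the clone step, and execs /bin/true; those choices and all function names are assumptions for illustration only.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <signal.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

/* Number of timer signals seen so far (updated from the handler). */
static volatile sig_atomic_t tick_count;

static void on_tick(int signo)
{
    (void)signo;
    tick_count++;
}

int ticks_seen(void)
{
    return (int)tick_count;
}

/* Install a SIGALRM handler so the timer signal does not kill the process. */
int install_tick_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_tick;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGALRM, &sa, NULL);
}

/* Create a POSIX timer that signals the whole process (no SIGEV_THREAD_ID)
 * and arm it to fire every interval_ns nanoseconds (must be < 1 second). */
int arm_repeating_timer(timer_t *out, long interval_ns)
{
    struct sigevent sev;
    struct itimerspec its;

    assert(out != NULL);
    memset(&sev, 0, sizeof(sev));
    sev.sigev_notify = SIGEV_SIGNAL;
    sev.sigev_signo = SIGALRM;
    if (timer_create(CLOCK_REALTIME, &sev, out) != 0)
        return -1;

    its.it_value.tv_sec = 0;
    its.it_value.tv_nsec = interval_ns;
    its.it_interval = its.it_value;      /* repeat at the same period */
    return timer_settime(*out, 0, &its, NULL);
}

/* Thread body: exec from a non-leader thread. The path is an assumption. */
static void *exec_thread(void *arg)
{
    (void)arg;
    execl("/bin/true", "true", (char *)NULL);
    return NULL;                         /* reached only if execl() failed */
}

/* Start the thread that performs the exec while the timer is still armed. */
int spawn_exec_thread(pthread_t *out)
{
    return pthread_create(out, NULL, exec_thread, NULL);
}
```

A main() would call install_tick_handler(), then arm_repeating_timer() with a short interval, then spawn_exec_thread(), and loop so that the exec races with timer expiry. Older glibc versions need -lrt -lpthread at link time.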

This patch:

was intended to fix exactly this problem (and the same test program was
used to exploit the race and test the fix).

So, it does not help? I can't reproduce the problem.

Yes, it does fix it. Somehow I missed the posting of that patch.

Note: I think you also need this patch:

otherwise I believe an oops can happen while killing this program if you are
running a kernel with this change applied:

[PATCH] Call exit_itimers from do_exit, not __exit_signal (commit 25f407f0b668f5e4ebd5d13e1fb4306ba6427ead)

first instance of this, we see that the thread-group leader is dead
and the exec code at line ~718 is setting the old leader's group_leader
to itself.

I think this code at line ~718

leader->group_leader = leader;

is a no-op, because leader->group_leader == leader here.

- leader->group_leader = leader;
+ leader->group_leader = current;

This can't help: without SIGEV_THREAD_ID we don't check ->group_leader;
the signal goes to the thread group via timer->it_process, which is equal
to the old leader.

The signal code returns <0, so posix-timers digs up the group_leader and tries again. Still, the patch fixes it all.
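
The retry described here can be pictured with a toy userspace model. This is a sketch for illustration, not kernel source: the struct layouts, the helper names, and the simplification that a send fails exactly when the target task is dead are all assumptions. Only the names group_leader, it_process and SIGEV_THREAD_ID come from the discussion.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a task; not the kernel's task_struct. */
struct task {
    bool alive;
    struct task *group_leader;
    int delivered;               /* signals this task has received */
};

/* Hypothetical stand-in for a POSIX timer. */
struct toy_timer {
    bool thread_id;              /* models SIGEV_THREAD_ID being set */
    struct task *it_process;     /* task the timer signals */
};

/* Assumed behaviour: sending to a dead task fails with a negative value. */
static int toy_send(struct task *t)
{
    if (!t->alive)
        return -1;
    t->delivered++;
    return 0;
}

/* Try the targeted thread first; on failure fall back to whatever
 * it_process->group_leader points at and try once more. */
int toy_timer_event(struct toy_timer *tm)
{
    assert(tm != NULL && tm->it_process != NULL);

    if (tm->thread_id) {
        int ret = toy_send(tm->it_process);
        if (ret >= 0)
            return ret;
        /* Target is gone: retarget the timer at the group leader. */
        tm->thread_id = false;
        tm->it_process = tm->it_process->group_leader;
    }
    return toy_send(tm->it_process);
}
```

In this model the retry only succeeds if the dead task's group_leader points at a live task, which is what the one-line change above was aiming for. Without the thread_id flag the group_leader pointer is never consulted, which matches the objection that the change cannot help for a plain process-directed timer.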

George Anzinger george@xxxxxxxxxx
