Re: Thread group exec race -> null pointer... HELP

From: George Anzinger
Date: Wed Nov 23 2005 - 15:33:16 EST


Oleg Nesterov wrote:
George Anzinger wrote:

Still rooting around in the above. The test program is attached. It
creates and arms a repeating timer and then clones a thread which does
an exec() call.


This patch:

http://marc.theaimsgroup.com/?l=linux-kernel&m=113138286512847

was intended to fix exactly this problem (and the same test program was
used to exploit the race and test the fix).

So, it does not help? I can't reproduce the problem.

Yes, it does fix it. Somehow I missed the posting of that patch.

Note: I think you also need this patch:

http://marc.theaimsgroup.com/?l=linux-kernel&m=113059955626598

otherwise I beleive OOPS can happen while killing this program if you are
running the kernel with this change applied:

[PATCH] Call exit_itimers from do_exit, not __exit_signal
http://kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=25f407f0b668f5e4ebd5d13e1fb4306ba6427ead



first instance of this, we see that the thread-group leader is dead
and the exec code at line ~718 is setting the old leaders group-leader
to him self.


I think this code at line ~718

leader->group_leader = leader;

is noop, because leader->group_leader == leader here.


- leader->group_leader = leader;
+ leader->group_leader = current;


This can't help, without SIGEV_THREAD_ID we don't check ->group_leader,
the signal goes to the thread group via timer->it_process, which is equal
to the old leader.

The signal code returns <0 so posix-timers digs into up the group_leader and trys again. Still, the patch fixes it all.

--
George Anzinger george@xxxxxxxxxx
HRT (High-res-timers): http://sourceforge.net/projects/high-res-timers/
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/