Re: heavy handed exit() in latest BK

From: Roland McGrath (roland@redhat.com)
Date: Sat Feb 08 2003 - 21:17:52 EST


> By then we also have local interrupts disabled, and we've explicitly
> disabled preemption, so I don't see how anything could ever wake us up any
> more.

We already know how this happens. I think it's only possible with SMP.
I described this very problem in the final paragraph of my penultimate
message on the signals changes:

    Incidentally, I've run across another bug introduced by the last rework of
    handle_stop_signal (or perhaps similar races have always been there, I'm
    not quite sure at the moment). It can call wake_up_process on a zombie
    that's on its way to exit, triggering the BUG at the end of do_exit. I
    think this race may be possible in all of the signal_wake_up calls for
    SIGKILL cases, and other uses of wake_up_process like PTRACE_KILL.
    Some such places check ->state, but they do not lock out exit races.
    Perhaps having wake_up_process itself be race-proof would be simplest.
    I don't have a good sense of how best to fix this one yet.

This will probably stop biting anyone in practice once the new plan for
SIGKILL that we've just been discussing goes in. For signals, the race will
only be possible for SIGCONT sent when a thread is on its way to die. That
can be avoided by checking PF_EXITING in handle_stop_signal, because after
setting PF_EXITING any thread in do_exit will take the siglock and thus
can't have gotten far enough to go to TASK_ZOMBIE without being serialized
after the loop in handle_stop_signal.
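
The ordering argument above can be sketched in userspace C11. This is only
an illustration, not kernel code: the struct, the flag and state values,
and the names exit_sketch and cont_sketch are all made up for the example,
and an atomic_flag spinlock stands in for the siglock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative values only; they are not taken from the kernel headers. */
#define PF_EXITING 0x4u
enum { TASK_RUNNING = 0, TASK_STOPPED = 4, TASK_ZOMBIE = 8 };

/* Hypothetical stand-in for the fields of a task that matter here. */
struct task {
    atomic_uint flags;
    atomic_int state;
};

/* Stand-in for the siglock: a trivial spinlock. */
static atomic_flag siglock = ATOMIC_FLAG_INIT;

static void siglock_lock(void)
{
    while (atomic_flag_test_and_set(&siglock))
        ; /* spin until the holder clears the flag */
}

static void siglock_unlock(void)
{
    atomic_flag_clear(&siglock);
}

/*
 * Exit side: set PF_EXITING first, then pass through the siglock,
 * and only afterwards become a zombie.
 */
static void exit_sketch(struct task *t)
{
    atomic_fetch_or(&t->flags, PF_EXITING);
    siglock_lock();    /* waits here if a SIGCONT loop is in progress */
    siglock_unlock();
    atomic_store(&t->state, TASK_ZOMBIE);
}

/*
 * SIGCONT side: under the siglock, wake stopped threads but skip any
 * that have PF_EXITING set. Returns how many threads were woken.
 */
static size_t cont_sketch(struct task *threads, size_t n)
{
    size_t woken = 0;

    siglock_lock();
    for (size_t i = 0; i < n; i++) {
        if (atomic_load(&threads[i].flags) & PF_EXITING)
            continue; /* on its way out: leave it alone */
        if (atomic_load(&threads[i].state) == TASK_STOPPED) {
            atomic_store(&threads[i].state, TASK_RUNNING);
            woken++;
        }
    }
    siglock_unlock();
    return woken;
}
```

A thread that the loop sees without PF_EXITING cannot yet have passed
through the lock in exit_sketch, so it cannot reach TASK_ZOMBIE until the
loop has dropped the lock.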

As I said above, I think this race is possible in other uses of
wake_up_process. PTRACE_KILL is one example, but there are others and I
would have to check carefully to be convinced that other factors rule out
this exit race for them. I think that BUG_ON check should definitely go
into try_to_wake_up so that it hits should any of these other races ever
actually bite. Unless I am missing something, it won't necessarily catch
all races unless you use xchg to set TASK_RUNNING and then check the old value.
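
To illustrate the xchg point, here is a userspace C11 sketch in which
atomic_exchange plays the role of xchg. The struct, the state values, and
the function names are assumptions made for the example; none of this is
the actual scheduler code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Illustrative values only; they are not taken from the kernel headers. */
enum { TASK_RUNNING = 0, TASK_INTERRUPTIBLE = 1, TASK_ZOMBIE = 8 };

/* Hypothetical stand-in for the one field of a task that matters here. */
struct task {
    atomic_long state;
};

/*
 * Check, then set. The state can become TASK_ZOMBIE between the load and
 * the store, so a check placed here can miss the race it is meant to
 * catch. Returns false if a zombie was seen at the time of the load.
 */
static bool wake_check_then_set(struct task *t)
{
    if (atomic_load(&t->state) == TASK_ZOMBIE)
        return false;
    /* window: another CPU may store TASK_ZOMBIE right here */
    atomic_store(&t->state, TASK_RUNNING);
    return true;
}

/*
 * Exchange. The old value is fetched in the same atomic step that stores
 * TASK_RUNNING, so there is no window between the check and the set.
 */
static long wake_xchg(struct task *t)
{
    return atomic_exchange(&t->state, TASK_RUNNING);
}

/* True if the wakeup just overwrote TASK_ZOMBIE, i.e. the BUG case. */
static bool woke_a_zombie(struct task *t)
{
    return wake_xchg(t) == TASK_ZOMBIE;
}
```

In the kernel the result of woke_a_zombie would feed the BUG_ON; in this
sketch it is just a boolean the caller can inspect.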