Re: [PATCH] x86_64: fix delayed signals

From: Linus Torvalds
Date: Thu Jul 10 2008 - 22:23:48 EST




On Thu, 10 Jul 2008, Linus Torvalds wrote:
>
> So now I'm considering just putting it in before the 2.6.26 release after
> all ;)

.. and having looked at the code, and thought about it some more, I'm
definitely off the patch again.

The reason is actually exactly the same bug that showed up when you did
this for x86-32 three years ago, and that may in fact still be lurking.

The endless loop of "call do_notify_resume until all the work flags are
zero" is very fragile: it will immediately cause a hard lockup if there is
some circumstance where do_notify_resume will not clear the flag.
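
To make the failure mode concrete, here is a minimal userspace sketch of
such a loop. It is not the kernel's entry code: task_model,
notify_resume_model(), exit_loop_model(), the flag values and the iteration
cap are all made up for illustration. The cap only exists so the sketch can
report the lockup instead of spinning; the real loop has no such escape.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the work flags; not the kernel's values. */
#define WORK_SIGPENDING 0x1u
#define WORK_NOTIFY     0x2u

struct task_model {
    unsigned int flags;       /* pending work bits */
    bool handler_clears_sig;  /* false models a path that returns
                                 without handling the signal */
};

/* Models one call to do_notify_resume(): handle whatever is flagged. */
void notify_resume_model(struct task_model *t)
{
    if (t->flags & WORK_NOTIFY)
        t->flags &= ~WORK_NOTIFY;
    if ((t->flags & WORK_SIGPENDING) && t->handler_clears_sig)
        t->flags &= ~WORK_SIGPENDING;
}

/*
 * Models the exit path: keep calling until no work bits remain.
 * Returns the number of calls made, or -1 if the cap was reached
 * with work still flagged (the case that would spin forever).
 */
int exit_loop_model(struct task_model *t, int max_iterations)
{
    int calls = 0;

    assert(max_iterations > 0);
    while (t->flags != 0) {
        if (calls == max_iterations)
            return -1;
        notify_resume_model(t);
        calls++;
    }
    return calls;
}
```

With handler_clears_sig set, exit_loop_model() finishes after a single
call. With it cleared, the sketch runs into the cap, which is the point
where the uncapped loop would hang.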

And when it comes to signals, there are several cases that can cause
TIF_SIGPENDING to not be cleared:

- confusion about user/kernel mode, where "do_signal()" will return
  without doing anything at all if we're in kernel mode.

  This was the bug we hit back in 2005 with an out-of-tree kernel-based
  vm86 model (which hopefully has since died a painful death).

- get_signal_to_deliver() returning and not handling the signal.
  dequeue_signal() will do this for that collect_signal() case and for
  the whole DRI notifier thing. The DRI notifier() case actually clears
  TIF_SIGPENDING, but then we do "recalc_sigpending()" in the caller, so
  it might get set again.

  I do hate that code (I know you do too), and the code _should_ block
  the signal that gets ignored (so recalc_sigpending() should keep it
  cleared), but it's not entirely obvious. Maybe it gets into an endless
  loop of calling the notifier if this case ever triggers?

- recalc_sigpending() expressly does not clear the TIF_SIGPENDING flag if
  we hit the "freezing(current)" case. So TIF_SIGPENDING stays set for
  freezing() processes. I think (and *hope*) they all get caught by other
  means anyway in that whole do_notify_resume() loop, but this is another
  of those "the freezer code is insane, I'm not going to try to think it
  through" cases.
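
The first and third cases above can be modelled in a few lines of
userspace C. This is a simplified sketch under stated assumptions, not the
kernel's code: task_model and the *_model() functions are hypothetical
names, the three booleans stand in for TIF_SIGPENDING, "an unblocked signal
is queued" and freezing(current), and the notifier case is left out.

```c
#include <assert.h>
#include <stdbool.h>

struct task_model {
    bool sigpending;        /* stands in for TIF_SIGPENDING */
    bool unblocked_signal;  /* an unblocked signal is queued */
    bool freezing;          /* stands in for freezing(current) */
};

/*
 * Model of the recalculation: set the flag if a signal is queued,
 * clear it only if nothing is queued AND the task is not freezing.
 * A freezing task with nothing queued keeps whatever it had.
 */
void recalc_sigpending_model(struct task_model *t)
{
    assert(t);
    if (t->unblocked_signal)
        t->sigpending = true;
    else if (!t->freezing)
        t->sigpending = false;
}

/*
 * Model of the signal handling step. If we are not returning to
 * user mode, it does nothing and leaves the flag alone. Otherwise
 * it "delivers" the queued signal and recalculates the flag.
 * Returns true if a delivery was attempted.
 */
bool do_signal_model(struct task_model *t, bool returning_to_user)
{
    assert(t);
    if (!returning_to_user)
        return false;
    t->unblocked_signal = false;
    recalc_sigpending_model(t);
    return true;
}
```

In this model the flag is cleared only on the ordinary path (returning to
user mode, not freezing). In the kernel-mode case and in the freezing case
sigpending stays set after the call, so a "loop until the flag is zero"
caller would never see it drop.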

In short, I think your patch is fine now, but I'm also nervous enough
about it that I'm not going to apply it. Any bugs it could expose look
very unlikely, and if they exist they are probably bugs on 32-bit as we
speak, but call me a worry-wart.

Linus