Re: [PATCH 4.20 11/50] signal: Always notice exiting tasks
From: Greg Kroah-Hartman
Date: Tue Feb 19 2019 - 04:07:21 EST
On Tue, Feb 19, 2019 at 07:23:41AM +0100, Jiri Slaby wrote:
> On 13. 02. 19, 19:38, Greg Kroah-Hartman wrote:
> > 4.20-stable review patch. If anyone has any objections, please let me know.
> > ------------------
> > From: Eric W. Biederman <ebiederm@xxxxxxxxxxxx>
> > commit 35634ffa1751b6efd8cf75010b509dcb0263e29b upstream.
> > Recently syzkaller was able to create unkillable processes by
> > creating a timer that is delivered as a thread local signal on SIGHUP,
> > and receiving SIGHUP with SA_NODEFER, ultimately causing a loop
> > failing to deliver SIGHUP but always trying.
> > Upon examination it turns out part of the problem is actually most of
> > the solution. Since 2.5, signal delivery has found all fatal signals,
> > marked the signal group for death, and queued SIGKILL in every thread's
> > thread queue, relying on signal->group_exit_code to preserve the
> > information of which was the actual fatal signal.
> > The conversion of all fatal signals to SIGKILL results in the
> > synchronous signal heuristic in next_signal kicking in and preferring
> > SIGHUP to SIGKILL, which is especially problematic as all
> > fatal signals have already been transformed into SIGKILL.
> > Instead of dequeueing signals and depending upon SIGKILL to
> > be the first signal dequeued, first test if the signal group
> > has already been marked for death. This guarantees that
> > nothing in the signal queue can prevent a process that needs
> > to exit from exiting.
> > Cc: stable@xxxxxxxxxxxxxxx
> > Tested-by: Dmitry Vyukov <dvyukov@xxxxxxxxxx>
> > Reported-by: Dmitry Vyukov <dvyukov@xxxxxxxxxx>
> > Ref: ebf5ebe31d2c ("[PATCH] signal-fixes-2.5.59-A4")
> > History Tree: https://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git
> > Signed-off-by: "Eric W. Biederman" <ebiederm@xxxxxxxxxxxx>
> > Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>
> This patch breaks strace self-tests in 4.20.9. In particular,
> The test received some fix a day ago, but it did not help in this case:
> Only a revert of the above patch helped.
> I don't know if the strace's test is broken (which is quite usual in
> cases like these) or the patch affects some user-visible behaviour --
> e.g. could this be a reason for sh failures in the build farm?
> Any ideas?
Does cf43a757fd49 ("signal: Restore the stop PTRACE_EVENT_EXIT") help
with this? It's queued up for the next round of stable releases and is
in Linus's tree.
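
For readers following the thread, the ordering change described in the quoted commit message can be illustrated with a small user-space toy model. This is only a sketch under stated assumptions, not the kernel code: the struct toy_task type, the toy_* functions, and the TOY_* constants are all hypothetical names invented here, and the "synchronous signal heuristic" is reduced to a single preferred signal number. It shows why dequeueing first can pick SIGHUP over SIGKILL, while testing the group-exit mark first cannot.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for signal numbers; not taken from <signal.h>. */
#define TOY_SIGHUP  1
#define TOY_SIGKILL 9

/* Toy model of a task's signal state (assumption: not the kernel layout). */
struct toy_task {
    bool group_exit;   /* models "signal group already marked for death" */
    uint64_t pending;  /* bit (n - 1) set means signal n is pending */
    int sync_sig;      /* signal the heuristic prefers, 0 if none */
};

/* Bit mask for signal number sig (1..64). */
static uint64_t toy_sigmask(int sig)
{
    return (uint64_t)1 << (sig - 1);
}

/* Models the dequeue heuristic: prefer the "synchronous" signal if it is
 * pending, otherwise return the lowest-numbered pending signal. */
static int toy_next_signal(const struct toy_task *t)
{
    if (t->sync_sig && (t->pending & toy_sigmask(t->sync_sig)))
        return t->sync_sig;
    for (int sig = 1; sig <= 64; sig++)
        if (t->pending & toy_sigmask(sig))
            return sig;
    return 0;
}

/* Old ordering in this model: rely on SIGKILL being dequeued first. */
static int toy_get_signal_old(const struct toy_task *t)
{
    return toy_next_signal(t);
}

/* New ordering in this model: test the group-exit mark before dequeueing,
 * so nothing in the queue can shadow the pending exit. */
static int toy_get_signal_new(const struct toy_task *t)
{
    if (t->group_exit)
        return TOY_SIGKILL;
    return toy_next_signal(t);
}
```

In this model, a task marked for group exit with both SIGHUP and SIGKILL pending, and SIGHUP preferred by the heuristic, keeps getting SIGHUP from toy_get_signal_old() but gets SIGKILL from toy_get_signal_new().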