Re: [PATCH 4.20 11/50] signal: Always notice exiting tasks
From: Jiri Slaby
Date: Tue Feb 19 2019 - 01:23:49 EST
On 13. 02. 19, 19:38, Greg Kroah-Hartman wrote:
> 4.20-stable review patch. If anyone has any objections, please let me know.
>
> ------------------
>
> From: Eric W. Biederman <ebiederm@xxxxxxxxxxxx>
>
> commit 35634ffa1751b6efd8cf75010b509dcb0263e29b upstream.
>
> Recently syzkaller was able to create unkillablle processes by
> creating a timer that is delivered as a thread local signal on SIGHUP,
> and receiving SIGHUP SA_NODEFERER. Ultimately causing a loop
> failing to deliver SIGHUP but always trying.
>
> Upon examination it turns out part of the problem is actually most of
> the solution. Since 2.5 signal delivery has found all fatal signals,
> marked the signal group for death, and queued SIGKILL in every threads
> thread queue relying on signal->group_exit_code to preserve the
> information of which was the actual fatal signal.
>
> The conversion of all fatal signals to SIGKILL results in the
> synchronous signal heuristic in next_signal kicking in and preferring
> SIGHUP to SIGKILL. Which is especially problematic as all
> fatal signals have already been transformed into SIGKILL.
>
> Instead of dequeueing signals and depending upon SIGKILL to
> be the first signal dequeued, first test if the signal group
> has already been marked for death. This guarantees that
> nothing in the signal queue can prevent a process that needs
> to exit from exiting.
>
> Cc: stable@xxxxxxxxxxxxxxx
> Tested-by: Dmitry Vyukov <dvyukov@xxxxxxxxxx>
> Reported-by: Dmitry Vyukov <dvyukov@xxxxxxxxxx>
> Ref: ebf5ebe31d2c ("[PATCH] signal-fixes-2.5.59-A4")
> History Tree: https://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git
> Signed-off-by: "Eric W. Biederman" <ebiederm@xxxxxxxxxxxx>
> Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>
This patch breaks strace self-tests in 4.20.9. In particular,
"threads-execve":
https://github.com/strace/strace/blob/master/tests/threads-execve.c
https://github.com/strace/strace/blob/master/tests/threads-execve.test
The test received some fix a day ago, but it did not help in this case:
https://github.com/strace/strace/commit/2a50278b9
Only a revert of the above patch helped.
I don't know if the strace's test is broken (which is quite usual in
cases like these) or the patch affects some user-visible behaviour --
e.g. could this be a reason for sh failures in the build farm?
Any ideas?
The failure is (the test output is non-unified diff: "<" lines are
expected, ">" is actual output from strace):
> FAIL: threads-execve
> ====================
>
> 11,12c11
> < 19311 execve("../threads-execve", ["../threads-execve", "8", "2"], 0x7ffc2447c258 /* 63 vars */ <unfinished ...>
> < 19181 <... rt_sigsuspend resumed>) = ?
> ---
>> 19311 execve("../threads-execve", ["../threads-execve", "8", "2"], 0x7ffc2447c258 /* 63 vars */ <pid changed to 19181 ...>
> 17,18c16
> < 19395 execve("../threads-execve", ["../threads-execve", "8", "3"], 0x7ffdedb69ee8 /* 63 vars */ <unfinished ...>
> < 19181 <... nanosleep resumed> <unfinished ...>) = ?
> ---
>> 19395 execve("../threads-execve", ["../threads-execve", "8", "3"], 0x7ffdedb69ee8 /* 63 vars */ <pid changed to 19181 ...>
> ...
> 11,12c11
> < 22715 execve("../threads-execve", ["../threads-execve", "8", "2"], 0x7fff2ea03388 /* 63 vars */ <unfinished ...>
> < 22657 <... rt_sigsuspend resumed>) = ?
> ---
>> 22715 execve("../threads-execve", ["../threads-execve", "8", "2"], 0x7fff2ea03388 /* 63 vars */ <pid changed to 22657 ...>
> 17,18c16
> < 22764 execve("../threads-execve", ["../threads-execve", "8", "3"], 0x7ffc5ea29658 /* 63 vars */ <unfinished ...>
> < 22657 <... nanosleep resumed> <unfinished ...>) = ?
> ---
>> 22764 execve("../threads-execve", ["../threads-execve", "8", "3"], 0x7ffc5ea29658 /* 63 vars */ <pid changed to 22657 ...>
> threads-execve.test: failed test: ../../strace -a21 -f -esignal=none -e trace=execve,exit,nanosleep,rt_sigsuspend ../threads-execve output mismatch
> ---
> kernel/signal.c | 6 ++++++
> 1 file changed, 6 insertions(+)
>
> --- a/kernel/signal.c
> +++ b/kernel/signal.c
> @@ -2393,6 +2393,11 @@ relock:
> goto relock;
> }
>
> + /* Has this task already been marked for death? */
> + ksig->info.si_signo = signr = SIGKILL;
> + if (signal_group_exit(signal))
> + goto fatal;
> +
> for (;;) {
> struct k_sigaction *ka;
>
> @@ -2488,6 +2493,7 @@ relock:
> continue;
> }
>
> + fatal:
> spin_unlock_irq(&sighand->siglock);
>
> /*
>
>
--
js