Re: [RFC][PATCH 09/11] tty_io: Use do_send_sig_info in __do_SACK to forcibly kill tasks

From: Eric W. Biederman
Date: Mon Jul 16 2018 - 15:17:44 EST

Next message: Thomas Gleixner: "Re: [PATCH v3] time: Fix incorrect sleeptime injection when suspend fails"
Previous message: Brijesh Singh: "Re: [PATCH] x86/efi: Access EFI MMIO data as unencrypted when SEV is active"
In reply to: Linus Torvalds: "Re: [RFC][PATCH 09/11] tty_io: Use do_send_sig_info in __do_SACK to forcibly kill tasks"
Next in thread: Linus Torvalds: "Re: [RFC][PATCH 09/11] tty_io: Use do_send_sig_info in __do_SACK to forcibly kill tasks"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> writes:

> On Mon, Jul 16, 2018 at 8:08 AM Eric W. Biederman <ebiederm@xxxxxxxxxxxx> wrote:
>>
>> The change for global init is it will now die if init is a member of the
>> session or init is using this tty as it's controlling tty.
>>
>> Semantically killing init with SAK is completely appropriate.
>
> No.
>
> Semtnaitcally killing init is completely wrong. Because it will kill
> the whole system.
>
> And I don't mean that in "now init won't spawn new things". I mean
> that in "now we don't have a child reaper any more, and the system
> will be dead because we'll panic on exit".
>
> So it's not about the controlling tty, it's about fundamental kernel
> internal consistency guarantees.
>
> See
>
> write_unlock_irq(&tasklist_lock);
> if (unlikely(pid_ns == &init_pid_ns)) {
> panic("Attempted to kill init! exitcode=0x%08x\n",
> father->signal->group_exit_code ?: father->exit_code);
> }
>
> in kernel/exit.c.

I should have said it doesn't matter because init does not open ttys and
become a member of session groups. Or at least it never has in my
experience. The only way I know to get that behavior is to boot with
init=/bin/bash.

With the force_sig already in do_SAK today my change is not a
regression. As force_sig in a completely different way forces the
signal to init.

Looking deeper, all of the silliness with SEND_SIG_FORCED and
force_sig_info is to guarantee delivery of synchronous exceptions even
to init.

So I think we want the patch below to clean that up. Then we don't have
to worry about the wrong things sending signals to init by accident, and
SEND_SIG_FORCED becomes just SEND_SIG_PRIV that skips the unnecesary
allocation of a siginfo struct.

Thoughts?

Eric

From: "Eric W. Biederman" <ebiederm@xxxxxxxxxxxx>
Date: Mon, 16 Jul 2018 13:29:04 -0500
Subject: [PATCH] signal: Cleanup delivery of exceptions to init

- Stop clearing SIGNAL_UNKILLABLE. It makes interaction by
the process with other signals problematic, and exceptions
are not necessarily fatal.

- Don't allow SIGKILL and SIGSTOP to the global init.
It never helps and it it can only make things worse.

- Explicitly allow exceptions to any kind of init. They are
exceptions and synchronous and need to be handled somehow.
Init can setup a handler or deal with the default action.

This is not a change it is just code movement from
force_sig_info into send_signal and get_signal.

- Treat all signals from the kernel as if they are from an ancestor
pid namespace.

- Take out the overrides of SIGNAL_UNKILLABLE from force_sig_info
The changes to send_signal and get_signal make them unnecessary.

- Take out the SEND_SIG_FORCED overrides from prepare_signal.
The changes to send_signal makes it redundant.

- Rename force back to from_ancestor_ns as that makes the logic
with respect to namespaces clearer and logically if the kernel
is sending you a signal it is from your ancestor namespace.

Signed-off-by: "Eric W. Biederman" <ebiederm@xxxxxxxxxxxx>
---
kernel/signal.c | 41 ++++++++++++++++-------------------------
1 file changed, 16 insertions(+), 25 deletions(-)

diff --git a/kernel/signal.c b/kernel/signal.c
index 94296afacf44..298f5c690681 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -72,20 +72,21 @@ static int sig_handler_ignored(void __user *handler, int sig)
(handler == SIG_DFL && sig_kernel_ignore(sig));
}

-static int sig_task_ignored(struct task_struct *t, int sig, bool force)
+static int sig_task_ignored(struct task_struct *t, int sig, bool from_ancestor_ns)
{
void __user *handler;

handler = sig_handler(t, sig);

if (unlikely(t->signal->flags & SIGNAL_UNKILLABLE) &&
- handler == SIG_DFL && !(force && sig_kernel_only(sig)))
+ handler == SIG_DFL &&
+ (is_global_init(t) || !(from_ancestor_ns && sig_kernel_only(sig))))
return 1;

return sig_handler_ignored(handler, sig);
}

-static int sig_ignored(struct task_struct *t, int sig, bool force)
+static int sig_ignored(struct task_struct *t, int sig, bool from_ancestor_ns)
{
/*
* Blocked signals are never ignored, since the
@@ -103,7 +104,7 @@ static int sig_ignored(struct task_struct *t, int sig, bool force)
if (t->ptrace && sig != SIGKILL)
return 0;

- return sig_task_ignored(t, sig, force);
+ return sig_task_ignored(t, sig, from_ancestor_ns);
}

/*
@@ -809,7 +810,7 @@ static void ptrace_trap_notify(struct task_struct *t)
* Returns true if the signal should be actually delivered, otherwise
* it should be dropped.
*/
-static bool prepare_signal(int sig, struct task_struct *p, bool force)
+static bool prepare_signal(int sig, struct task_struct *p, bool from_ancestor_ns)
{
struct signal_struct *signal = p->signal;
struct task_struct *t;
@@ -871,7 +872,7 @@ static bool prepare_signal(int sig, struct task_struct *p, bool force)
}
}

- return !sig_ignored(p, sig, force);
+ return !sig_ignored(p, sig, from_ancestor_ns);
}

/*
@@ -1008,8 +1009,7 @@ static int __send_signal(int sig, struct kernel_siginfo *info, struct task_struc
assert_spin_locked(&t->sighand->siglock);

result = TRACE_SIGNAL_IGNORED;
- if (!prepare_signal(sig, t,
- from_ancestor_ns || (info == SEND_SIG_FORCED)))
+ if (!prepare_signal(sig, t, from_ancestor_ns))
goto ret;

pending = (type != PIDTYPE_PID) ? &t->signal->shared_pending : &t->pending;
@@ -1107,12 +1107,8 @@ static int __send_signal(int sig, struct kernel_siginfo *info, struct task_struc
static int send_signal(int sig, struct kernel_siginfo *info, struct task_struct *t,
enum pid_type type)
{
- int from_ancestor_ns = 0;
-
-#ifdef CONFIG_PID_NS
- from_ancestor_ns = si_fromuser(info) &&
+ int from_ancestor_ns = !si_fromuser(info) ||
!task_pid_nr_ns(current, task_active_pid_ns(t));
-#endif

return __send_signal(sig, info, t, type, from_ancestor_ns);
}
@@ -1178,8 +1174,8 @@ int do_send_sig_info(int sig, struct kernel_siginfo *info, struct task_struct *p
* since we do not want to have a signal handler that was blocked
* be invoked when user space had explicitly blocked it.
*
- * We don't want to have recursive SIGSEGV's etc, for example,
- * that is why we also clear SIGNAL_UNKILLABLE.
+ * Exceptions are always delivered so we don't need to worry
+ * about init or other processes blocking exception signals.
*/
int
force_sig_info(int sig, struct kernel_siginfo *info, struct task_struct *t)
@@ -1199,12 +1195,6 @@ force_sig_info(int sig, struct kernel_siginfo *info, struct task_struct *t)
recalc_sigpending_and_wake(t);
}
}
- /*
- * Don't clear SIGNAL_UNKILLABLE for traced tasks, users won't expect
- * debugging to leave init killable.
- */
- if (action->sa.sa_handler == SIG_DFL && !t->ptrace)
- t->signal->flags &= ~SIGNAL_UNKILLABLE;
ret = send_signal(sig, info, t, PIDTYPE_PID);
spin_unlock_irqrestore(&t->sighand->siglock, flags);

@@ -2536,9 +2526,10 @@ int get_signal(struct ksignal *ksig)
continue;

/*
- * Global init gets no signals it doesn't want.
- * Container-init gets no signals it doesn't want from same
- * container.
+ * Except for synchronous exceptions (!SI_FROMUSER)
+ * global init gets no signals it doesn't want, and
+ * container-init gets no signals it doesn't want from
+ * same container.
*
* Note that if global/container-init sees a sig_kernel_only()
* signal here, the signal must have been generated internally
@@ -2546,7 +2537,7 @@ int get_signal(struct ksignal *ksig)
* case, the signal cannot be dropped.
*/
if (unlikely(signal->flags & SIGNAL_UNKILLABLE) &&
- !sig_kernel_only(signr))
+ !sig_kernel_only(signr) && SI_FROMUSER(&ksig->info))
continue;

if (sig_kernel_stop(signr)) {
--
2.17.1

Next message: Thomas Gleixner: "Re: [PATCH v3] time: Fix incorrect sleeptime injection when suspend fails"
Previous message: Brijesh Singh: "Re: [PATCH] x86/efi: Access EFI MMIO data as unencrypted when SEV is active"
In reply to: Linus Torvalds: "Re: [RFC][PATCH 09/11] tty_io: Use do_send_sig_info in __do_SACK to forcibly kill tasks"
Next in thread: Linus Torvalds: "Re: [RFC][PATCH 09/11] tty_io: Use do_send_sig_info in __do_SACK to forcibly kill tasks"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]