Re: [PATCH 3/4] tty: Iterate only thread group leaders in __do_SAK()

From: Kirill Tkhai
Date: Tue Jan 16 2018 - 06:34:19 EST


On 15.01.2018 23:51, Oleg Nesterov wrote:
> On 01/15, Kirill Tkhai wrote:
>>
>> On 12.01.2018 19:42, Oleg Nesterov wrote:
>>
>>> IOW, I do not understand why we can't simply use rcu_read_lock() after
>>> do_each_pid_task/while_each_pid_task. Yes we can miss the new process/thread,
>>> but if the creator process had this tty opened it should be killed by us so
>>> fork/clone can't succeed: both do_fork() and send_sig() take the same lock
>>> and do_fork() checks signal_pending() under ->siglock.
>>>
>>> No?
>>
>> Yes, but we send signal not every time. So, this was the only reason I added
>> lock/unlock the locks. But anyway, __do_SAK() is racy and the effect of that
>> is minimal, so it seems we may skip this.
>
> Yes. If we don't send SIGKILL we do not care about the new child process/thread
> we can miss, it can't have this tty opened at fork() time. If the child opens
> this tty after that, __do_SAK can "miss" it anyway in that it can see it before
> it does open(tty).
>
>> I tested your patch with small modification in "struct files_struct *files;" ('*' is added).
>> Could I send it with your "Signed-off-by" as the second version?
>
> Yes, please feel free,
>
>> kill:
>> - force_sig(SIGKILL, p);
>> + send_sig(SIGKILL, p, 1);
>
> Agreed, I didn't actually want to use force_sig(SIGKILL), copy-and-paste error.

force_sig() is still safe under tasklist_lock as release_task() unhashes a task
from the lists and destroys sighand at the same time under it. So, it seems
there is no a problem :)

Anyway, we could use send_sig_info(SIGKILL, SEND_SIG_FORCED, p) instead of that
in the patch like you suggested below.

> But. on the second thought this probably needs another change... I don't understand
> these force_sig/send_sig in __do_SAK().
>
> If signal->tty == tty it does send_sig(SIGKILL), this won't kill the global or
> sub-namespace init.
>
> However, if iterate_fd() finds this tty it does force_sig(SIGKILL) which clears
> SIGNAL_UNKILLABLE, so it can kill even the global init.

Also we skip global init on session iteration. This could be useful for debugging,
when init is "/bin/bash" and some task started on top of bash is hunged.

> This looks strange, and probably unintentional. So it seems yoou should start
> with "revert 20ac94378 [PATCH] do_SAK: Don't recursively take the tasklist_lock" ?
> The original reason for that commit has gone a long ago.

If we revert it, lock_task_sighand() will be nested with task_lock(). I'm not
sure either there is an example of such nesting in kernel. Yeah, it's not for
a long time, next commit will change that. But anyway. Maybe, we add global
init check there and send_sig_info(SIGKILL, SEND_SIG_FORCED, p)? Please, see
patches on the bottom of this message.

> At the same time, I do not know if we actually want to kill sub-namespace inits
> or not. If yes, we can use SEND_SIG_FORCED (better than ugly force_sig()) but
> skip the global init. But this will need yet another change.

>From https://www.kernel.org/doc/Documentation/SAK.txt:

"An operating system's Secure Attention Key is a security tool which is
provided as protection against trojan password capturing programs. It
is an undefeatable way of killing all programs which could be
masquerading as login applications"

It seems, since not privileged user is able to create pid_ns to start
a "trojan password capturing program", we have to kill sub-namespace inits too.

Maybe we'll use something like this? (With a hope, two patches in one message is readable)

===============================[PATCH 1]====================================================
diff --git a/drivers/tty/tty_io.c b/drivers/tty/tty_io.c
index dc60aeea87d8..0df71480762a 100644
--- a/drivers/tty/tty_io.c
+++ b/drivers/tty/tty_io.c
@@ -2719,27 +2719,34 @@ void __do_SAK(struct tty_struct *tty)
read_lock(&tasklist_lock);
/* Kill the entire session */
do_each_pid_task(session, PIDTYPE_SID, p) {
+ if (unlikely(is_global_init(p)))
+ continue;
tty_notice(tty, "SAK: killed process %d (%s): by session\n",
task_pid_nr(p), p->comm);
- send_sig(SIGKILL, p, 1);
+ send_sig_info(SIGKILL, SEND_SIG_FORCED, p);
} while_each_pid_task(session, PIDTYPE_SID, p);

/* Now kill any processes that happen to have the tty open */
do_each_thread(g, p) {
+ if (unlikely(is_global_init(p)))
+ continue;
+
if (p->signal->tty == tty) {
tty_notice(tty, "SAK: killed process %d (%s): by controlling tty\n",
task_pid_nr(p), p->comm);
- send_sig(SIGKILL, p, 1);
- continue;
+ goto kill;
}
task_lock(p);
i = iterate_fd(p->files, 0, this_tty, tty);
+ task_unlock(p);
if (i != 0) {
tty_notice(tty, "SAK: killed process %d (%s): by fd#%d\n",
task_pid_nr(p), p->comm, i - 1);
- force_sig(SIGKILL, p);
+ goto kill;
}
- task_unlock(p);
+ continue;
+kill:
+ send_sig_info(SIGKILL, SEND_SIG_FORCED, p);
} while_each_thread(g, p);
read_unlock(&tasklist_lock);
#endif

===================================================[PATCH 2]=======================================
diff --git a/drivers/tty/tty_io.c b/drivers/tty/tty_io.c
index 0df71480762a..791f1d965d04 100644
--- a/drivers/tty/tty_io.c
+++ b/drivers/tty/tty_io.c
@@ -2704,7 +2704,8 @@ void __do_SAK(struct tty_struct *tty)
#ifdef TTY_SOFT_SAK
tty_hangup(tty);
#else
- struct task_struct *g, *p;
+ struct task_struct *p, *t;
+ struct files_struct *files;
struct pid *session;
int i;

@@ -2727,7 +2728,7 @@ void __do_SAK(struct tty_struct *tty)
} while_each_pid_task(session, PIDTYPE_SID, p);

/* Now kill any processes that happen to have the tty open */
- do_each_thread(g, p) {
+ for_each_process(p) {
if (unlikely(is_global_init(p)))
continue;

@@ -2736,18 +2737,28 @@ void __do_SAK(struct tty_struct *tty)
task_pid_nr(p), p->comm);
goto kill;
}
- task_lock(p);
- i = iterate_fd(p->files, 0, this_tty, tty);
- task_unlock(p);
- if (i != 0) {
- tty_notice(tty, "SAK: killed process %d (%s): by fd#%d\n",
- task_pid_nr(p), p->comm, i - 1);
- goto kill;
+
+ files = NULL;
+ for_each_thread(p, t) {
+ if (t->files == files) /* racy but we do not care */
+ continue;
+
+ task_lock(t);
+ files = t->files;
+ i = iterate_fd(files, 0, this_tty, tty);
+ task_unlock(t);
+
+ if (i != 0) {
+ tty_notice(tty, "SAK: killed process %d (%s): by fd#%d\n",
+ task_pid_nr(t), t->comm, i - 1);
+ goto kill;
+ }
}
+
continue;
kill:
send_sig_info(SIGKILL, SEND_SIG_FORCED, p);
- } while_each_thread(g, p);
+ }
read_unlock(&tasklist_lock);
#endif
}