Re: [RFC PATCH] Implement /proc/pid/kill
From: Daniel Colascione
Date: Wed Oct 31 2018 - 09:27:51 EST
On Wed, Oct 31, 2018 at 12:44 PM, Oleg Nesterov <oleg@xxxxxxxxxx> wrote:
> On 10/30, Eric W. Biederman wrote:
>>
>> At a bare minimum you need to perform the permission check using the
>> credentials of the opener of the file. Which means refactoring
>> kill_pid so that you can perform the permission check for killing the
>> application during open.
Absolutely right. Thanks for spotting that.
> perhaps it would be simpler to do
>
> my_cred = override_creds(file->f_cred);
> kill_pid(...);
> revert_creds(my_cred);
Thanks for the suggestion. That looks neat, but it's not quite enough.
The problem is that check_kill_permission looks for
same_thread_group(current, t) _before_ checking kill_of_by_cred, so
with just this code snippet, it'd still be possible for an
unprivileged process to open /proc/$PRIVILEGED_PID/kill and hand the
FD to $PRIVILEGED_PID, which would write to it and kill itself or any
of its threads. I think, with some rearrangement of permissions
checks, this problem can be overcome.
There's another problem though: say we open /proc/pid/5/kill *, with
proc 5 being an ordinary unprivileged process, e.g., the shell. At
open(2) time, the access check passes. Now suppose PID 5 execve(2)s
into a setuid process. The kill FD is still open, so the kill FD's
holder can send a signal to a process it normally wouldn't be able to
kill.
You might say, "let's somehow invalidate open kill FDs upon setuid
exec", but the problem that then results is then that a legitimate
privileged user of /proc/pid/kill (say, Android lmkd) might see its
/proc/pid/kill handle spuriously become invalidated if the process to
which it refers execs in a setuid way. Maybe in this case we make
could write(2) on the kill FD fail with ECHILD ("no child process"?)
and have callers, if they see ECHILD, close the kill FD, re-open it,
and try again. But now we're getting into an interface that's too
complicated to use from the shell.
Maybe a simpler approach would be to bind the kill FD to the struct
cred that opened it and keep the access check in write(2) --- a
write(2) with current->cred not equal to f_cred would fail with EPERM.
This way, you could play standard-output-of-setuid-program or
SCM_RIGHTS games with the kill FD itself, but you wouldn't be able to
do anything with the FD having done so.
Honestly, though, maybe a new procfs_sigqueue(2) system call would be
simpler and more robust. With a single system call, we wouldn't split
the permissions check between open(2) and write(2), and so the whole
problem disappears.
The downside is that you wouldn't be able to use the new feature via
the shell without a helper program. :-(
What do you think?
* I actually have a local variant of the patch that would have you
open "/proc/$PID/kill/$SIGNO" instead, since different signal numbers
have different permission checks.
This approach is kind of neat, since /proc/pid/kill/$SIGNO would act
as an "option" to kill a process only with a particular signal, and a
write(2) to a /proc/$PID/kill/$SIGNO file would allow you to specify a
sigqueue(2)-style siginfo value along with the actual signal number
(since the signal number is encoded in the filename). For example, a
privileged process could open /proc/self/kill/10 (SIGUSR1) and hand
the FD to an unprivileged process, letting that process signal (via
signal) completion of some process without giving that unprivileged
process the ability to send *any* signal to the privileged process.
But eventfd is almost certainly a better choice in this situation
anyway, I think.