Re: [PATCH][RFC] fix PTRACE_KILL

From: Eric W. Biederman
Date: Tue Aug 17 2021 - 14:21:41 EST


Al Viro <viro@xxxxxxxxxxxxxxxxxx> writes:

> [Cc'd to security@k.o, *NOT* because I consider it a serious security hole;
> it's just that the odds of catching relevant reviewers there are higher
> than on l-k and there doesn't seem to be any lists where that would be
> on-topic. My apologies for misuse of security@k.o ;-/]

Hmm. I don't see security@xxxxxxxxxx Cc'd.

> Current implementation is racy in quite a few ways - we check that
> the child is traced by us and use ptrace_resume() to feed it
> SIGKILL, provided that it's still alive.
>
> What we do not do is making sure that the victim is in ptrace stop;
> as the result, it can go and violate all kinds of assumptions,
> starting with "child->sighand won't change under ptrace_resume()",
> "child->ptrace won't get changed under user_disable_single_step()",
> etc.
>
> Note that ptrace(2) manpage has this to say:
>
> PTRACE_KILL
> Send the tracee a SIGKILL to terminate it. (addr and data are
> ignored.)
>
> This operation is deprecated; do not use it! Instead, send a
> SIGKILL directly using kill(2) or tgkill(2). The problem with
> PTRACE_KILL is that it requires the tracee to be in signal-
> delivery-stop, otherwise it may not work (i.e., may complete
> successfully but won't kill the tracee). By contrast, sending a
> SIGKILL directly has no such limitation.
>
> So let it check (under tasklist_lock) that the victim is traced by us
> and call sig_send_info() to feed it SIGKILL. It's easier that trying
> to force ptrace_resume() into handling that mess and it's less brittle
> that way.

I took a quick look and despite being deprecated PTRACE_KILL appears
to still have some active users (like gcc-10). So that seems to rule
out just removing PTRACE_KILL.

I looked at the bug that PTRACE_KILL only kills a process when it is
stopped and it is present in Linux 1.0. Given that I expect userspace
applications are ok with the current semantics rather than the intended
semantics.

The current semantics also include the weirdness that PTRACE_KILL only
kills a process when it is stopped in ptrace_signal, and not at other
ptrace stops.

So rather than fix the code to do what was intended 27 years ago,
why don't we accept the fact that PTRACE_KILL is equivalent
to PTRACE_CONT with data = SIGKILL.

If there are regressions or we really care we can tweak the return value
to return 0 instead of -ESRCH when the process is not stopped.

Something like this:

diff --git a/kernel/ptrace.c b/kernel/ptrace.c
index f8589bf8d7dc..f40f0a0ff70a 100644
--- a/kernel/ptrace.c
+++ b/kernel/ptrace.c
@@ -1221,8 +1221,6 @@ int ptrace_request(struct task_struct *child, long request,
return ptrace_resume(child, request, data);

case PTRACE_KILL:
- if (child->exit_state) /* already dead */
- return 0;
return ptrace_resume(child, request, SIGKILL);

#ifdef CONFIG_HAVE_ARCH_TRACEHOOK
@@ -1304,8 +1302,7 @@ SYSCALL_DEFINE4(ptrace, long, request, long, pid, unsigned long, addr,
goto out_put_task_struct;
}

- ret = ptrace_check_attach(child, request == PTRACE_KILL ||
- request == PTRACE_INTERRUPT);
+ ret = ptrace_check_attach(child, request == PTRACE_INTERRUPT);
if (ret < 0)
goto out_put_task_struct;

@@ -1449,8 +1446,7 @@ COMPAT_SYSCALL_DEFINE4(ptrace, compat_long_t, request, compat_long_t, pid,
goto out_put_task_struct;
}

- ret = ptrace_check_attach(child, request == PTRACE_KILL ||
- request == PTRACE_INTERRUPT);
+ ret = ptrace_check_attach(child, request == PTRACE_INTERRUPT);
if (!ret) {
ret = compat_arch_ptrace(child, request, addr, data);
if (ret || request != PTRACE_DETACH)

Eric