Re: [PATCH] seccomp: plug syscall-dodging ptrace hole
From: Kees Cook
Date: Thu May 26 2016 - 22:41:13 EST
On Thu, May 26, 2016 at 7:10 PM, Andy Lutomirski <luto@xxxxxxxxxxxxxx> wrote:
> On Thu, May 26, 2016 at 2:04 PM, Kees Cook <keescook@xxxxxxxxxxxx> wrote:
>> One problem with seccomp was that ptrace could be used to change a
>> syscall after seccomp filtering had completed. This was a well-documented
>> limitation, and it was recommended to block ptrace when defining a filter
>> to avoid this problem. This can be quite a limitation for containers or
>> other places where ptrace is desired even under seccomp filters.
>>
>> Since seccomp filtering has been split into pre-trace and trace phases
>> (phase1 and phase2 respectively), it's possible to re-run phase1 seccomp
>> after ptrace. This makes that change, and updates the test suite for
>> both SECCOMP_RET_TRACE and PTRACE_SYSCALL manipulation.
>
> I like fixing the hole, but I don't like this fix.
>
> The two-phase seccomp mechanism is messy. I wrote it because it was a
> huge speedup. Since then, I've made a ton of changes to the way that
> x86 syscalls work, and there are two relevant effects: the slow path
> is quite fast, and the phase-1-only path isn't really a win any more.
>
> I suggest that we fix the hole by simplifying the code instead of making it
> even more complicated. Let's back out the two-phase mechanism (but
> keep the ability for arch code to supply seccomp_data) and then just
> reorder it so that seccomp happens after ptrace. The result should be
> considerably simpler. (We'll still have to answer the question of
> what happens when a SECCOMP_RET_TRACE event changes the syscall, but
> maybe the answer is to just let it through -- after all,
> SECCOMP_RET_TRACE might be a request by a tracer to do its own
> internal filtering.)

I'm really against this. I think seccomp needs to stay first, and I
like the two-phase split because it gives us a lot of flexibility on
other architectures. And we can't just let through RET_TRACE because
we'll have exactly the same problem: a process can add a RET_TRACE
filter for some syscall and then change it arbitrarily to escape the
filtering. The non-trace returns of seccomp need to be checked both before
and after ptrace manipulations. The patch seems like the best approach and
it covers all the corners.
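
To make the escape concrete, here is a toy model of the ordering problem.
It is only a sketch and not kernel code: the names (seccomp_filter,
syscall_entry, the policy dict, the tracer callback) are hypothetical and
exist only for illustration. It covers the RET_TRACE case only, and it
assumes that a second TRACE verdict on the recheck pass is simply let
through.

```python
# Toy model only: every name here is hypothetical, none is a kernel API.
ALLOW, KILL, TRACE = "allow", "kill", "trace"


def seccomp_filter(policy, syscall):
    """Return the filter verdict for a syscall name (default: KILL)."""
    return policy.get(syscall, KILL)


def syscall_entry(policy, syscall, tracer=None, recheck=True):
    """Return the syscall that finally runs, or None if it is killed."""
    verdict = seccomp_filter(policy, syscall)
    if verdict == KILL:
        return None
    if verdict == TRACE and tracer is not None:
        # The tracer may rewrite the syscall number at this point.
        syscall = tracer(syscall)
        if recheck:
            # Second pass: only the non-trace verdicts are enforced here
            # (assumption of this sketch: a repeated TRACE is allowed).
            if seccomp_filter(policy, syscall) == KILL:
                return None
    return syscall


if __name__ == "__main__":
    policy = {"getpid": TRACE, "read": ALLOW}
    to_execve = lambda _syscall: "execve"  # tracer swaps in a blocked call

    # Without the recheck, the rewritten syscall escapes the filter.
    print(syscall_entry(policy, "getpid", to_execve, recheck=False))
    # With the recheck, the rewritten syscall is filtered like any other.
    print(syscall_entry(policy, "getpid", to_execve, recheck=True))
```

With recheck=False the process gets "execve" through a filter that never
allowed it, which is the hole. With recheck=True the rewritten syscall is
run through the filter again and rejected.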
-Kees
--
Kees Cook
Chrome OS & Brillo Security