Re: seccomp vs ptrace

From: Andy Lutomirski
Date: Wed Mar 18 2015 - 18:06:42 EST


On Wed, Mar 18, 2015 at 2:44 PM, Kees Cook <keescook@xxxxxxxxxxxx> wrote:
> On Wed, Mar 18, 2015 at 2:42 PM, Andy Lutomirski <luto@xxxxxxxxxxxxxx> wrote:
>> On Wed, Mar 18, 2015 at 2:38 PM, Kees Cook <keescook@xxxxxxxxxxxx> wrote:
>>> On Wed, Mar 18, 2015 at 2:30 PM, Serge E. Hallyn <serge@xxxxxxxxxx> wrote:
>>>> Hi,
>>>>
>>>> I'm writing to ask about this passage:
>>>>
>>>> The seccomp check will not be run again after the tracer is
>>>> notified. (This means that seccomp-based sandboxes MUST NOT
>>>> allow use of ptrace, even of other sandboxed processes, without
>>>> extreme care; ptracers can use this mechanism to escape.)
>>>>
>>>> This basically means that seccomp cannot be safely used with, for
>>>> instance, an upstart-based container. I've been told that Andy was working on
>>>> changing the order so that ptrace checks would be done before seccomp.
>>>> Is there any update on that? Is it likely to happen? Scrapped?
>>>
>>> There are two problems, as I see it:
>>>
>>> 1) seccomp filtering happens first, so any following ptrace actions
>>> could change the syscall that actually happens (e.g. a filter allows
>>> clone and ptrace, meaning it could start a child, ptrace it, issue an
>>> allowed syscall, catch it, and change it to a disallowed syscall:
>>> escape from sandbox).
>>>
>>> 2) even if ptrace were moved ahead of seccomp, a sandboxed process as
>>> above that also has access to add more filters (via the seccomp or
>>> prctl syscalls) could use SECCOMP_RET_TRACE to catch the syscall at
>>> the end of the seccomp checks, which would allow the same escape as above.
>>
>> Ouch!
>>
>> Arguably we messed up by making SECCOMP_RET_TRACE have higher
>> precedence than ERRNO and TRAP. We could add new ERRNO and TRAP
>> actions that have high precedence or a new flag that promotes them in
>> the filter being applied.
>
> Nope, RET_TRACE is lower. KILL, TRAP, ERRNO, TRACE, ALLOW. Still
> doesn't help the above cases, but we can't override a blocked syscall
> just with a new filter. You'd still have to do the ptrace dance with
> an allowed syscall.

Oh, right, I read it backwards.
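
To make the ordering Kees describes concrete, here is a rough user-space
model of how results from stacked filters combine: the action with the
numerically lowest value wins, so a later filter returning TRACE cannot
override an earlier ERRNO, TRAP, or KILL. This is only a sketch. The RET_*
names and combine_filter_results() are made up for illustration; the
numeric values are meant to mirror the SECCOMP_RET_* constants in
<linux/seccomp.h> as of this thread.

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative copies of the SECCOMP_RET_* action values (assumed to
 * match <linux/seccomp.h> at the time of this thread). */
#define RET_KILL   0x00000000U
#define RET_TRAP   0x00030000U
#define RET_ERRNO  0x00050000U
#define RET_TRACE  0x7ff00000U
#define RET_ALLOW  0x7fff0000U
#define RET_ACTION 0x7fff0000U  /* mask selecting the action bits */

/*
 * Hypothetical helper: given the return value of every attached filter,
 * pick the one whose action has the highest precedence (lowest value).
 * Assumes at least one filter is attached; with n == 0 this model simply
 * returns RET_ALLOW.
 */
static uint32_t combine_filter_results(const uint32_t *results, size_t n)
{
    uint32_t ret = RET_ALLOW;

    for (size_t i = 0; i < n; i++) {
        /* Compare only the action bits; keep the data bits of the winner. */
        if ((results[i] & RET_ACTION) < (ret & RET_ACTION))
            ret = results[i];
    }
    return ret;
}
```

With results { ALLOW, TRACE, ERRNO|1 } the model picks ERRNO|1, which is
why adding a filter that returns TRACE does not unblock a syscall that an
earlier filter already rejects.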

We could try to handle TRACE immediately instead of after running all
filters. This could be rather tricky given the way the x86 code
works, though. Maybe at some point we'll be able to change that
without killing performance.

We could add an ugly flag that says that subsequent filters can use
TRACE, I suppose. Yuck.

--Andy