Re: seccomp: dump core when using SECCOMP_RET_KILL

From: Kees Cook
Date: Tue Jan 31 2017 - 19:19:01 EST

On Tue, Jan 31, 2017 at 1:59 PM, Andy Lutomirski <luto@xxxxxxxxxxxxxx> wrote:
> On Fri, Jan 27, 2017 at 1:48 PM, Kees Cook <keescook@xxxxxxxxxxxx> wrote:
>> On Wed, Jan 25, 2017 at 12:05 PM, Kees Cook <keescook@xxxxxxxxxxxx> wrote:
>>> On Tue, Jan 24, 2017 at 4:53 PM, Andrei Vagin <avagin@xxxxxxxxxxxxx> wrote:
>>>> Hi,
>>>> One of CRIU tests fails with this patch:
>>>> Before this patch only a thread which called a "wrong" syscall is killed.
>>>> Now a whole process is killed if one of threads called a "wrong" syscall.
>>> Oh ew. I wonder what is causing this? In other do_coredump() callers,
>>> they explicitly call do_group_exit(). Hmmm
>> We need to find a way to fix this or remove the coredump change from
>> -next. We have a few things that have come up recently (coincident
>> with the coredump change):
>> - some folks would like seccomp kills to kill the entire process not
>> just the thread
>> - on a full-process kill, there needs to be a way to get a coredump
>> - on a kill, it would be nice to have reliable logging
>> Getting a coredump requires a full-process kill. It is possible to do
>> this already with RET_TRAP and just not catch the SIGSYS. However,
>> this isn't sufficient if you want to be _sure_ the entire process gets
>> killed since RET_TRAP depends on cooperation from the process.
>> Getting reliable logging out of seccomp for non-RET_KILL is
>> non-trivial because syscall-audit doesn't track forks.
>> The RET_* values are part of the UAPI, so changing or adding to them
>> requires care.
>> Right now we have very little room in the RET_* values (the lower
>> bytes are for the RET_DATA which is ignored for RET_KILL and the
>> semantics of changing that is very difficult):
>> #define SECCOMP_RET_KILL 0x00000000U /* kill the task immediately */
>> #define SECCOMP_RET_TRAP 0x00030000U /* disallow and force a SIGSYS */
>> Killing the entire process is more aggressive than RET_KILL currently,
>> so the question becomes, should we upgrade RET_KILL to
>> RET_KILL_PROCESS and add RET_KILL_THREAD? Are there people that WANT
>> only a thread to be killed? Andrei, does CRIU depend on this behavior,
>> or is it "just" a regression test detail?
> I think we actually have more flexibility: we have all the low bits --
> they're currently ignored. We could plausibly have one code for "kill
> this thread, no dump", one code for "kill this process, dump core",
> and maybe one code for "kill this process, don't dump core".
> I would be a bit surprised if anyone uses codes between 0x1 and 0xffff
> inclusive, so changing the semantics of those codes ought to be safe.
> (Or we could gate them behind a flag.) I'm not sure how I feel about
> changing the semantics of 0.

The most recently installed filter controls the value of the low bits,
so this isn't quite useful for this case, I don't think.


Kees Cook
Pixel Security