Re: [RFC PATCH] ptrace: don't report syscall-exit if the tracee was killed by seccomp

From: Oleg Nesterov

Date: Sun Mar 22 2026 - 11:14:23 EST


On 03/22, Kees Cook wrote:
>
> On March 22, 2026 6:44:54 AM PDT, Oleg Nesterov <oleg@xxxxxxxxxx> wrote:
> >__seccomp_filter() does
> >
> > 	case SECCOMP_RET_KILL_THREAD:
> > 	case SECCOMP_RET_KILL_PROCESS:
> > 		...
> > 		/* Show the original registers in the dump. */
> > 		syscall_rollback(current, current_pt_regs());
> >
> > 		/* Trigger a coredump with SIGSYS */
> > 		force_sig_seccomp(this_syscall, data, true);
> >
> >syscall_rollback() sets regs->ax = orig_ax. This means that
> >ptrace_get_syscall_info_exit() will see .is_error == 0. To the tracer,
> >it looks as if the aborted syscall actually succeeded and returned its
> >own syscall number.
> >
> >And since force_sig_seccomp() uses force_coredump == true, SIGSYS won't
> >be reported (see the SA_IMMUTABLE check in get_signal()), so the tracee
> >will "silently" exit with error_code == SIGSYS after the bogus report.
> >
> >Change syscall_exit_work() to avoid the bogus single-step/syscall-exit
> >reports if the tracee is SECCOMP_MODE_DEAD.
> >
> >TODO: With or without this change, get_signal() -> ptrace_signal() may
> >report other !SA_IMMUTABLE pending signals before it dequeues SIGSYS.
> >Perhaps it makes sense to change get_signal() to check SECCOMP_MODE_DEAD
> >too and prioritize the fatal SIGSYS.
> >
> >Reported-by: Max Ver <dudududumaxver@xxxxxxxxx>
> >Closes: https://lore.kernel.org/all/CABjJbFJO+p3jA1r0gjUZrCepQb1Fab3kqxYhc_PSfoqo21ypeQ@xxxxxxxxxxxxxx/
> >Signed-off-by: Oleg Nesterov <oleg@xxxxxxxxxx>
> >---
> > include/linux/entry-common.h | 3 +++
> > include/linux/seccomp.h | 8 ++++++++
> > kernel/seccomp.c | 3 ---
> > 3 files changed, 11 insertions(+), 3 deletions(-)
> >
> >diff --git a/include/linux/entry-common.h b/include/linux/entry-common.h
> >index f83ca0abf2cd..5c62bda9dcf9 100644
> >--- a/include/linux/entry-common.h
> >+++ b/include/linux/entry-common.h
> >@@ -250,6 +250,9 @@ static __always_inline void syscall_exit_work(struct pt_regs *regs, unsigned lon
> > if (work & SYSCALL_WORK_SYSCALL_TRACEPOINT)
> > trace_syscall_exit(regs, syscall_get_return_value(current, regs));
> >
> >+ if (killed_by_seccomp(current))
> >+ return;
>
> Hmm. I'm still not convinced this is right,

Me too actually ;)

That is why this is an RFC. So:

- Do you agree that the current behaviour is not really "sane" and
can confuse ptracers?

- If yes, what else do you think we can do? No, I no longer think it
makes sense to change the ptrace_get_syscall_info_exit() paths...


> but if we make this change, I'd want to see a behavioral test added
> (likely to the seccomp self tests), and to make sure the rr test suite doesn't regress.

OK. I'll try to take a look at these tests and possibly add another one.

But (sorry) not next week; I will be travelling.

Oleg.