Re: [RFC PATCH 0/2] seccomp: defer syscall_rollback() to get_signal()

From: Oleg Nesterov

Date: Thu Apr 16 2026 - 10:07:49 EST


On 04/15, Kees Cook wrote:
>
> I've spent some more time looking at all this. It does seem to me that
> dropping syscall_exit_work() entirely for killed syscalls is the right way
> to go for fixing the audit/trace/ptrace confusion on the exit side.

OK, great. I'll try to make a patch as soon as I have time. Hopefully this week.

> But
> I don't think it closes the whole problem.

I guess we can discuss this in more detail later/separately?

> Apologies for any verbosity
> here, I'm kind of taking notes for myself too. :)

Thanks for the detailed email ;)

I will snip some parts for now...

> I was trying to consider whether fixing this with a new ptrace event
> (PTRACE_EVENT_SECCOMP_KILL or a new PTRACE_SYSCALL_INFO op) would be
> better than reusing the existing signal-delivery stop (but perhaps in a
> "read-only" mode). My sense is that a new event isn't worth it,

Agreed,

> So I'm thinking the full fix is to change what SA_IMMUTABLE actually
> means: instead of "ptrace is disabled", it can be "the signal cannot
> be changed (i.e. cannot stop the kill)". Which means in get_signal()
> at the SA_IMMUTABLE check, stop gating ptrace_signal() on the flag and
> instead pass the flag into ptrace_signal() (or check in other places) so
> it can run in a "read-only" mode.

OK, we can add something like PT_FREEZED which lives in task->ptrace,
but see below.

> I think refusing tracer actions would be best,

agreed

> - syscall_exit_work() skips the exit tracehook, audit, and trace
> when the syscall was RET_KILLed.

Good ;)

> - SA_IMMUTABLE stops disabling ptrace_signal() and starts gating
> mutations within it.

Honestly, I am not sure this is really useful... But I do not know.
And again, we can discuss this later I hope.

----------------------------------------------------------------------
Now a stupid question ;)

Why does __seccomp_filter() use syscall_rollback() anyway?

OK, maybe ax == orig_ax makes sense for coredump, I dunno.

But
	case SECCOMP_RET_TRAP:
		/* Show the handler the original registers. */
		syscall_rollback(current, current_pt_regs());
		/* Let the filter pass back 16 bits of data. */
		force_sig_seccomp(this_syscall, data, false);

the handler can just use info.si_syscall instead of sigcontext.rax ?
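
For illustration only, a minimal userspace sketch of a SIGSYS handler that
takes the syscall number from siginfo rather than from the saved registers.
The names sigsys_syscall_nr(), sigsys_handler(), install_sigsys_handler()
and last_trapped_nr are made up for this example, and the fallback
definition of SYS_SECCOMP assumes the libc headers may not provide it. No
seccomp filter is installed here; this only shows the handler side.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <string.h>

#ifndef SYS_SECCOMP
#define SYS_SECCOMP 1	/* si_code of a seccomp-generated SIGSYS */
#endif

/*
 * Hypothetical helper: return the syscall number carried by a
 * seccomp-generated SIGSYS, or -1 if this is not such a signal.
 */
static int sigsys_syscall_nr(const siginfo_t *info)
{
	if (!info || info->si_signo != SIGSYS || info->si_code != SYS_SECCOMP)
		return -1;
	return info->si_syscall;
}

/* Last trapped syscall number, written only by the handler. */
static volatile sig_atomic_t last_trapped_nr = -1;

static void sigsys_handler(int sig, siginfo_t *info, void *uctx)
{
	(void)sig;
	(void)uctx;	/* saved register context is deliberately not consulted */
	last_trapped_nr = sigsys_syscall_nr(info);
}

/* Install the handler with SA_SIGINFO so that siginfo is passed in. */
static int install_sigsys_handler(void)
{
	struct sigaction sa;

	memset(&sa, 0, sizeof(sa));
	sa.sa_sigaction = sigsys_handler;
	sa.sa_flags = SA_SIGINFO;
	sigemptyset(&sa.sa_mask);
	return sigaction(SIGSYS, &sa, NULL);
}
```

A process would call install_sigsys_handler() before loading a filter that
returns SECCOMP_RET_TRAP. Whether existing handlers in the wild rely on the
rolled-back register instead is a separate question this sketch does not
answer.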

Oleg.