Re: [RFC PATCH 0/2] seccomp: defer syscall_rollback() to get_signal()
From: Oleg Nesterov
Date: Thu Apr 16 2026 - 10:07:49 EST
On 04/15, Kees Cook wrote:
>
> I've spent some more time looking at all this. It does seem to me that
> dropping syscall_exit_work() entirely for killed syscalls is the right way
> to go for fixing the audit/trace/ptrace confusion on the exit side.
OK, great. I'll try to make a patch as soon as I have time. Hopefully this week.
> But
> I don't think it closes the whole problem.
I guess we can discuss this in more detail later/separately?
> Apologies for any verbosity
> here, I'm kind of taking notes for myself too. :)
Thanks for the detailed email ;)
I will snip some parts for now...
> I was trying to consider whether fixing this with a new ptrace event
> (PTRACE_EVENT_SECCOMP_KILL or a new PTRACE_SYSCALL_INFO op) would be
> better than reusing the existing signal-delivery stop (but perhaps in a
> "read-only" mode). My sense is that a new event isn't worth it,
Agreed,
> So I'm thinking the full fix is to change what SA_IMMUTABLE actually
> means: instead of "ptrace is disabled", it can be "the signal cannot
> be changed (i.e. cannot stop the kill)". Which means in get_signal()
> at the SA_IMMUTABLE check, stop gating ptrace_signal() on the flag and
> instead pass the flag into ptrace_signal() (or check in other places) so
> it can run in a "read-only" mode.
OK, we can add something like PT_FREEZED which lives in task->ptrace,
but see below.
> I think refusing tracer actions would be best,
agreed
> - syscall_exit_work() skips the exit tracehook, audit, and trace
> when the syscall was RET_KILLed.
Good ;)
> - SA_IMMUTABLE stops disabling ptrace_signal() and starts gating
> mutations within it.
Honestly, I am not sure this is really useful... But I do not know.
And again, we can discuss this later I hope.
----------------------------------------------------------------------
Now a stupid question ;)
Why does __seccomp_filter() use syscall_rollback() anyway?
OK, maybe ax == orig_ax makes sense for coredump, I dunno.
But

	case SECCOMP_RET_TRAP:
		/* Show the handler the original registers. */
		syscall_rollback(current, current_pt_regs());
		/* Let the filter pass back 16 bits of data. */
		force_sig_seccomp(this_syscall, data, false);

the handler can just use info.si_syscall instead of sigcontext.rax?
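
To illustrate the userspace side of that question, here is a minimal sketch
of a SIGSYS handler that takes the syscall number from siginfo rather than
from the saved registers. It assumes Linux with glibc (which exposes the
field as si_syscall) and a kernel with seccomp filter support. The helper
names (on_sigsys, install_trap_filter, trapped_syscall_nr) are made up for
this example, and the filter skips the usual seccomp_data.arch check to stay
short, so it is not production-grade.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>
#include <sys/prctl.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <linux/filter.h>
#include <linux/seccomp.h>

/* Syscall number recorded by the handler; -1 until a trap is seen. */
static volatile sig_atomic_t seen_nr = -1;

/* SIGSYS handler: read the number from siginfo, not from the ucontext. */
static void on_sigsys(int sig, siginfo_t *info, void *uctx)
{
	(void)sig;
	(void)uctx;
	seen_nr = info->si_syscall;
}

/*
 * Install a filter that returns SECCOMP_RET_TRAP for syscall 'nr' and
 * SECCOMP_RET_ALLOW for everything else.
 * Assumption: no arch check, single-ABI process.
 */
static int install_trap_filter(int nr)
{
	struct sock_filter insns[] = {
		/* A = seccomp_data.nr */
		BPF_STMT(BPF_LD | BPF_W | BPF_ABS,
			 offsetof(struct seccomp_data, nr)),
		/* if (A == nr) fall through to TRAP, else skip to ALLOW */
		BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, (unsigned int)nr, 0, 1),
		BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_TRAP),
		BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW),
	};
	struct sock_fprog prog = {
		.len = sizeof(insns) / sizeof(insns[0]),
		.filter = insns,
	};

	assert(nr >= 0);
	/* Required for an unprivileged task to install a filter. */
	if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0))
		return -1;
	return prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &prog);
}

/*
 * Trap getppid() once and return the syscall number the handler saw,
 * or -1 if the handler or the filter could not be installed.
 */
int trapped_syscall_nr(void)
{
	struct sigaction sa;

	memset(&sa, 0, sizeof(sa));
	sa.sa_sigaction = on_sigsys;
	sa.sa_flags = SA_SIGINFO;
	sigemptyset(&sa.sa_mask);
	if (sigaction(SIGSYS, &sa, NULL))
		return -1;
	if (install_trap_filter(__NR_getppid))
		return -1;

	/* The filter traps this call; SIGSYS is delivered synchronously. */
	syscall(__NR_getppid);
	return seen_nr;
}
```

Note that the filter stays installed for the rest of the process lifetime,
so every later getppid() in that process traps as well.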
Oleg.