Re: [RFC PATCH v3 2/3] seccomp: add kernel-installed pinned-memfd redirect

From: Andy Lutomirski

Date: Wed Jun 24 2026 - 16:11:37 EST


On Tue, Jun 23, 2026 at 8:39 PM Cong Wang <xiyou.wangcong@xxxxxxxxx> wrote:
>
> On Tue, Jun 23, 2026 at 3:21 PM Andy Lutomirski <luto@xxxxxxxxxxxxxx> wrote:
> > The implementation of the patch is a bit sad -- there is no technical
> > reason that the code couldn't actually issue the syscall and avoid all
> > the heap and task-work crud. The kernel isn't really structured like
> > that, but I don't think it would be a big departure.
>
> You are right. I have redesigned it a bit again.
>
> SEND_REDIRECT is replaced with SECCOMP_IOCTL_NOTIF_RUN:
> instead of rewriting the trapped task's argument registers and resuming
> the same syscall, the supervisor names an explicit {nr, args}, and seccomp
> issues *that* syscall in the target's context, reports its return value as the
> trapped syscall's result, and skips the original. It runs in the target
> (its mm/creds/fds), just dispatched into a scratch pt_regs frame so the
> task's real registers are never touched.

What is a "scratch pt_regs frame"? What happens to any code in the
kernel that touches current_pt_regs()?

Even if you tried to make a list of syscalls that don't inherently
mess with current_pt_regs() or other task state that might be
relevant, what happens if there's a tracer that expects
syscall_get_args and such to work?
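
To make the two questions above concrete, here is a toy userspace model,
not kernel code. Every name in it (regs_model, task_model, the model_*
helpers) is hypothetical, and the syscall numbers are arbitrary. It assumes
only that current_pt_regs() and the syscall_get_* accessors derive their
answer from the task rather than from whatever frame a dispatcher is
holding, so anything that goes through them sees the trapped syscall, not
the one running out of the scratch frame.

```c
#include <assert.h>
#include <stdio.h>

/* Toy stand-in for struct pt_regs: only the fields the model needs. */
struct regs_model {
    long nr;       /* syscall number as seen in this frame */
    long args[6];  /* syscall arguments as seen in this frame */
};

/* Toy stand-in for a task: "real" is the frame saved at syscall entry. */
struct task_model {
    struct regs_model real;
};

/* Models current_pt_regs(): always derived from the task, never from
 * whatever frame the dispatcher happens to be holding. */
static struct regs_model *model_current_pt_regs(struct task_model *t)
{
    return &t->real;
}

/* Models a tracer reading the stopped task's syscall number and first
 * argument (the role the syscall_get_* accessors play in the kernel). */
static void model_tracer_view(struct task_model *t, long *nr, long *arg0)
{
    *nr = model_current_pt_regs(t)->nr;
    *arg0 = model_current_pt_regs(t)->args[0];
}

/* Hypothetical dispatcher check: returns 1 if the task-derived view matches
 * the {nr, args} actually being executed from the scratch frame, else 0. */
static int model_views_agree(struct task_model *t,
                             const struct regs_model *scratch)
{
    long nr, arg0;

    model_tracer_view(t, &nr, &arg0);
    return nr == scratch->nr && arg0 == scratch->args[0];
}

#ifdef DEMO_MAIN
int main(void)
{
    struct task_model t = { .real = { .nr = 2, .args = { 100 } } };
    struct regs_model scratch = { .nr = 257, .args = { -100 } };

    printf("views agree: %d\n", model_views_agree(&t, &scratch));
    return 0;
}
#endif
```

Building with -DDEMO_MAIN gives a standalone program. In this model the
views agree only when the scratch frame is a copy of the real one, which is
exactly the case where a scratch frame buys nothing.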

>
> Because nothing is rewritten, there is nothing to restore, so the
> task_work/heap fixup is deleted, and with it all of the signal corner
> cases.

Really? What happens to -ERESTARTxyz?
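
As a sketch of why this matters, here is a simplified userspace model of the
x86 restart path for the case where no signal handler runs. It is not kernel
code. The three restart values are the kernel-internal ones from
include/linux/errno.h. The regs_model struct and the model_* functions are
hypothetical, and model_finish_run() is only my reading of "reports its
return value as the trapped syscall's result". If the substituted syscall
returns a restart code, the restart logic reloads the number from orig_ax,
so what gets re-issued is the trapped syscall, not the substituted one.

```c
#include <assert.h>
#include <stdio.h>

/* Kernel-internal restart codes (values from include/linux/errno.h);
 * they are never supposed to reach user space. */
#define ERESTARTSYS     512
#define ERESTARTNOINTR  513
#define ERESTARTNOHAND  514

/* Toy stand-in for the x86 fields that matter for restart. */
struct regs_model {
    long ax;       /* return value slot */
    long orig_ax;  /* syscall number saved at entry */
    long ip;       /* user instruction pointer, just past the syscall insn */
};

/* Simplified model of the no-handler restart path on x86: rewind over the
 * 2-byte syscall instruction and reload the number from orig_ax.
 * ERESTART_RESTARTBLOCK is left out of this model.
 * Returns 1 if a restart was arranged, 0 otherwise. */
static int model_restart_no_handler(struct regs_model *regs)
{
    switch (regs->ax) {
    case -ERESTARTSYS:
    case -ERESTARTNOINTR:
    case -ERESTARTNOHAND:
        regs->ax = regs->orig_ax;
        regs->ip -= 2;
        return 1;
    default:
        return 0;
    }
}

/* Hypothetical: store the substituted syscall's return value as the trapped
 * syscall's result, then let the usual restart logic look at it. */
static int model_finish_run(struct regs_model *regs, long substituted_ret)
{
    regs->ax = substituted_ret;
    return model_restart_no_handler(regs);
}

#ifdef DEMO_MAIN
int main(void)
{
    /* Trapped syscall number 2 (arbitrary); user ip just past the insn. */
    struct regs_model regs = { .ax = 0, .orig_ax = 2, .ip = 0x1002 };
    int restarted = model_finish_run(&regs, -ERESTARTSYS);

    printf("restarted=%d nr=%ld ip=%#lx\n", restarted, regs.ax,
           (unsigned long)regs.ip);
    return 0;
}
#endif
```

In this model the restarted syscall is the trapped one, so it would hit the
filter and the supervisor a second time. The other choice is to convert the
restart code into something else before it lands in the result slot. Either
way that looks like a signal corner case to me.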

>
> Please let me know if this looks better to you too.
>
> Regards,
> Cong



--
Andy Lutomirski
AMA Capital Management, LLC