Re: [RFC PATCH v1 00/13] exec: add spawn templates for repeated executable startup

From: Li Chen

Date: Tue Jun 09 2026 - 11:02:57 EST

Hi Andy,

---- On Tue, 09 Jun 2026 08:01:57 +0800 Andy Lutomirski <luto@xxxxxxxxxx> wrote ---
> On Thu, May 28, 2026 at 4:05 AM Christian Brauner <brauner@xxxxxxxxxx> wrote:
> >
> > On Thu, May 28, 2026 at 05:52:21PM +0800, Li Chen wrote:
> > > Hi,
> > >
> > > This is an early RFC for an idea that is probably still rough in both the
> > > UAPI and implementation details. Sorry for the rough edges; I am sending
> > > it now to check whether this direction is worth pursuing and to get
> > > feedback on the kernel/userspace boundary.
> >
> > The idea of having a builder api for exec isn't all that crazy. But it
> > should simply be built on top of pidfds and thus pidfs itself instead.
> > It has all the basic infrastructure in place already. Any implementation
> > should also allow userspace to implement posix_spawn() on top of it.
> >
> > fd = pidfd_open(0, PIDFD_EMPTY /* or better name */)
> >
> > pidfd_config(fd, ...) // modeled similar to fsconfig()
> >
>
> After contemplating this for a bit... why pidfd? Doesn't a pidfd
> refer to an actual process that is, or at least was, running? This
> new thing is a process that we are contemplating spawning. I can
> imagine that basically all pidfd APIs would be a bit confused by the
> nonexistence of the process in question.
>

Yes, I think that is a real concern.

In my current local WIP I tried to keep that distinction explicit.
pidfd_spawn_open() returns a pidfs-backed builder fd, not a normal pidfd
referring to a process. The builder fd is allocated as an anonymous pidfs
file with builder-specific file operations:

file = pidfs_alloc_anon_file("[pidfd_spawn]",
&pidfd_spawn_builder_fops, builder,
O_RDWR);

and the normal pidfd helpers still reject it because it does not use the
ordinary pidfd file operations:

struct pid *pidfd_pid(const struct file *file)
{
if (file->f_op != &pidfs_file_operations)
return ERR_PTR(-EBADF);
return file_inode(file)->i_private;
}

So the current split is:

builder_fd = pidfd_spawn_open(...); /* builder object */
pidfd_config(builder_fd, ...);
child_pidfd = pidfd_spawn_run(builder_fd, ...); /* real pidfd */

Only the last fd is a normal pidfd for an actual child process. The
builder fd is only accepted by the builder operations.

This avoids having to define what waitid(P_PIDFD), pidfd_send_signal(),
pidfd_getfd(), poll(), etc. mean before the process exists. The downside
is that it adds a separate open-style entry point and is less uniform than
the pidfd_open(0, PIDFD_EMPTY) spelling Christian sketched.

If people think there is a better way to represent the pre-spawn builder
state, or if the preference is to integrate it directly into pidfd_open()
with an explicit empty/future-pidfd state, I would be happy to discuss
that.

Regards,
Li