Re: [RFC PATCH 1/2] vfs: syscalls: add mkdirat_fd()
From: David Laight
Date: Wed Apr 01 2026 - 10:20:03 EST
On Tue, 31 Mar 2026 21:13:34 +0200
"Arnd Bergmann" <arnd@xxxxxxxx> wrote:
> On Tue, Mar 31, 2026, at 19:19, Jori Koolstra wrote:
> > Currently there is no way to race-freely create and open a directory.
> > For regular files we have open(O_CREAT) for creating a new file inode,
> > and returning a pinning fd to it. The lack of such functionality for
> > directories means that when populating a directory tree there's always
> > a race involved: the inodes first need to be created, and then opened
> > to adjust their permissions/ownership/labels/timestamps/acls/xattrs/...,
> > but in the time window between the creation and the opening they might
> > be replaced by something else.
> >
> > Addressing this race without proper APIs is possible (by immediately
> > fstat()ing what was opened, to verify that it has the right inode type),
> > but difficult to get right. Hence, mkdirat_fd() that creates a directory
> > and returns an O_DIRECTORY fd is useful.
> >
> > This feature idea (and description) is taken from the UAPI group:
> > https://github.com/uapi-group/kernel-features?tab=readme-ov-file#race-free-creation-and-opening-of-non-file-inodes
> >
> > Signed-off-by: Jori Koolstra <jkoolstra@xxxxxxxxx>
>
> I checked that the calling conventions are fine, i.e. this will work
> as expected across all architectures. I assume you are also aware
> that the non-RFC patch will need to add the syscall number to all
> .tbl files.
>
> The hardest problem here does seem to be the naming of the
> new syscall, and I'm sorry to not be able to offer any solution
> either, just two observations:
>
> - mkdirat/mkdirat_fd sounds similar to the existing
> quotactl/quotactl_fd pair, but quotactl_fd() takes a file
> descriptor argument rather than returning it, which makes
> this addition quite confusing.
>
> - the nicest interface IMO would have been a variation of
> openat(dfd, filename, O_CREAT | O_DIRECTORY, mode)
> but that is a minefield of incompatible implementations[1],
> so we can't do that without changing the behavior for
> existing callers that currently run into an error.
Just require O_TMPFILE to be set as well :-)
You know you'll never regret it once Apr-1 is over.
Can something be done with the flags to openat2()?
That might save allocating an extra system call.
David
>
> Arnd
>
> [1] https://lwn.net/Articles/926782/
>