Re: [RFC PATCH 1/2] vfs: syscalls: add mkdirat_fd()
From: H. Peter Anvin
Date: Tue Mar 31 2026 - 16:47:40 EST
On March 31, 2026 1:25:03 PM PDT, Yann Droneaud <yann@xxxxxxxxxxx> wrote:
>Hi,
>
>On 31/03/2026 at 19:19, Jori Koolstra wrote:
>> Currently there is no way to race-freely create and open a directory.
>> For regular files we have open(O_CREAT) for creating a new file inode,
>> and returning a pinning fd to it. The lack of such functionality for
>> directories means that when populating a directory tree there's always
>> a race involved: the inodes first need to be created, and then opened
>> to adjust their permissions/ownership/labels/timestamps/acls/xattrs/...,
>> but in the time window between the creation and the opening they might
>> be replaced by something else.
>>
>> Addressing this race without proper APIs is possible (by immediately
>> fstat()ing what was opened, to verify that it has the right inode type),
>> but difficult to get right. Hence, mkdirat_fd() that creates a directory
>> and returns an O_DIRECTORY fd is useful.
>>
>> This feature idea (and description) is taken from the UAPI group:
>> https://github.com/uapi-group/kernel-features?tab=readme-ov-file#race-free-creation-and-opening-of-non-file-inodes
>>
>> Signed-off-by: Jori Koolstra <jkoolstra@xxxxxxxxx>
>> ---
>> arch/x86/entry/syscalls/syscall_64.tbl | 1 +
>> fs/internal.h | 1 +
>> fs/namei.c | 26 ++++++++++++++++++++++++--
>> include/linux/fcntl.h | 2 ++
>> include/linux/syscalls.h | 2 ++
>> include/uapi/asm-generic/fcntl.h | 3 +++
>> include/uapi/asm-generic/unistd.h | 5 ++++-
>> scripts/syscall.tbl | 1 +
>> 8 files changed, 38 insertions(+), 3 deletions(-)
>
>> diff --git a/include/linux/fcntl.h b/include/linux/fcntl.h
>> index a332e79b3207..d2f0fdb82847 100644
>> --- a/include/linux/fcntl.h
>> +++ b/include/linux/fcntl.h
>> @@ -25,6 +25,8 @@
>> #define force_o_largefile() (!IS_ENABLED(CONFIG_ARCH_32BIT_OFF_T))
>> #endif
>> +#define VALID_MKDIRAT_FD_FLAGS (MKDIRAT_FD_NEED_FD)
>> +
>
>I don't see support for an O_CLOEXEC-like flag. Is the file descriptor close-on-exec by default? If so, that should be mentioned.
>
>
>> diff --git a/include/uapi/asm-generic/fcntl.h b/include/uapi/asm-generic/fcntl.h
>> index 613475285643..621458bf1fbf 100644
>> --- a/include/uapi/asm-generic/fcntl.h
>> +++ b/include/uapi/asm-generic/fcntl.h
>> @@ -95,6 +95,9 @@
>> #define O_NDELAY O_NONBLOCK
>> #endif
>> +/* Flags for mkdirat_fd */
>> +#define MKDIRAT_FD_NEED_FD 0x01
>> +
>
>
>Regards.
>
>
And even if it is, POSIX already has O_CLOFORK, and we should expect that it will be needed, too.