Re: [RFC PATCH 1/2] vfs: syscalls: add mkdirat_fd()
From: Aleksa Sarai
Date: Wed Apr 01 2026 - 22:54:24 EST
On 2026-04-01, Mateusz Guzik <mjguzik@xxxxxxxxx> wrote:
> On Tue, Mar 31, 2026 at 07:19:58PM +0200, Jori Koolstra wrote:
> > @@ -5286,7 +5290,25 @@ int filename_mkdirat(int dfd, struct filename *name, umode_t mode)
> > lookup_flags |= LOOKUP_REVAL;
> > goto retry;
> > }
> > +
> > + if (!error && (flags & MKDIRAT_FD_NEED_FD)) {
> > + struct path new_path = { .mnt = path.mnt, .dentry = dentry };
> > + error = FD_ADD(0, dentry_open(&new_path, O_DIRECTORY, current_cred()));
> > + }
> > + end_creating_path(&path, dentry);
> > return error;
>
>
> You can't do it like this. Should it turn out no fd can be allocated,
> the entire thing is going to error out while keeping the newly created
> directory behind. You need to allocate the fd first, then do the hard
> work, and only then fd_install and or free the fd. The FD_ADD machinery
> can probably still be used provided proper wrapping of the real new
> mkdir.
>
> It should be perfectly feasible to de facto wrap existing mkdir
> functionality by this syscall.
>
> On top of that similarly to what other people mentioned the new syscall
> will definitely want to support O_CLOEXEC and probably other flags down
> the line.
>
> Trying to handle this in open() is a no-go. openat2 is rather
> problematic.

I'm interested in what makes you say that. It would be very nice to be able
to do mkdir + RESOLVE_IN_ROOT and get an fd back all in one syscall. :D

To be fair, build_open_how() will need some more magic to keep openat()
working, and that won't be particularly pretty. If we went with
O_CREAT|O_DIRECTORY we would need to be quite careful to make sure
O_TMPFILE continues to work for both openat() and openat2()...

> I tend to agree mkdirat_fd is not a good name for the syscall either,
> but I don't have a suggestion I'm happy with. I think least bad name
> would follow the existing stuff and be mkdirat2 or similar.
>
> The routine would have to start with validating the passed O_ flags, for
> now only allowing O_CLOEXEC and EINVAL-ing otherwise.

Please do not use O_* flags! O_CLOEXEC is defined to a different bit on
different architectures and so takes up 3 flag bits in total, which makes
adding new flags a nightmare.

I think this should take AT_* flags and (like most newer syscalls)
O_CLOEXEC should be automatically set. Userspace can unset it with
fcntl(F_SETFD) in the relatively rare case where they don't want
O_CLOEXEC. Alternatively, we could just bite the bullet and make
AT_NO_CLOEXEC a thing...

But yes, new syscalls *absolutely* need to take some kind of flag
argument. I'd hoped we finally learned our lesson on that one...

--
Aleksa Sarai
https://www.cyphar.com/