Re: [RFC PATCH 1/2] vfs: syscalls: add mkdirat_fd()
From: Jori Koolstra
Date: Wed Apr 01 2026 - 06:35:14 EST
> On 01-04-2026 06:19 CEST, Mateusz Guzik <mjguzik@xxxxxxxxx> wrote:
>
>
> On Tue, Mar 31, 2026 at 07:19:58PM +0200, Jori Koolstra wrote:
> > @@ -5286,7 +5290,25 @@ int filename_mkdirat(int dfd, struct filename *name, umode_t mode)
> >  		lookup_flags |= LOOKUP_REVAL;
> >  		goto retry;
> >  	}
> > +
> > +	if (!error && (flags & MKDIRAT_FD_NEED_FD)) {
> > +		struct path new_path = { .mnt = path.mnt, .dentry = dentry };
> > +		error = FD_ADD(0, dentry_open(&new_path, O_DIRECTORY, current_cred()));
> > +	}
> > +	end_creating_path(&path, dentry);
> >  	return error;
>
>
> You can't do it like this. Should it turn out no fd can be allocated,
> the entire thing is going to error out while keeping the newly created
> directory behind. You need to allocate the fd first, then do the hard
> work, and only then fd_install or free the fd. The FD_ADD machinery
> can probably still be used provided proper wrapping of the real new
> mkdir.

But isn't this exactly what happens in open(O_CREAT) too? Eventually we
call

	error = dir_inode->i_op->create(idmap, dir_inode, dentry,
					mode, open_flag & O_EXCL);

and only then do we assign and install the fd. AFAIK there is no cleanup
happening there either if the FD_ADD step fails. You will just have a
regular file and no descriptor. But I would have to test this to be sure.
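
A userspace check along these lines could settle the question. This is
only a sketch: the helper name file_left_behind_on_emfile() is made up for
illustration, and it assumes the caller may lower and then restore its own
RLIMIT_NOFILE soft limit and that the scratch path does not exist yet. It
forces descriptor allocation to fail and then looks at whether open(O_CREAT)
still created the file.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/resource.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper, not part of the patch.
 *
 * Returns 1 if `path` exists after open(O_CREAT) failed with EMFILE,
 * 0 if it does not, and -1 if the experiment could not be carried out.
 */
static int file_left_behind_on_emfile(const char *path)
{
	struct rlimit old, tight;
	struct stat st;
	int fd, saved_errno, exists;

	/* The result is only meaningful if the path is absent up front. */
	if (stat(path, &st) == 0)
		return -1;

	if (getrlimit(RLIMIT_NOFILE, &old) != 0)
		return -1;

	/* With a soft limit of 0, no new descriptor can be allocated. */
	tight = old;
	tight.rlim_cur = 0;
	if (setrlimit(RLIMIT_NOFILE, &tight) != 0)
		return -1;

	fd = open(path, O_CREAT | O_EXCL | O_WRONLY, 0600);
	saved_errno = errno;

	/* Restore the original soft limit before doing anything else. */
	setrlimit(RLIMIT_NOFILE, &old);

	if (fd >= 0) {
		/* The limit did not bite, so the experiment is void. */
		close(fd);
		unlink(path);
		return -1;
	}
	if (saved_errno != EMFILE)
		return -1;

	/* stat() needs no descriptor, so it works under any limit. */
	exists = (stat(path, &st) == 0);
	if (exists)
		unlink(path);
	return exists;
}
```

Calling it from a small main() with a scratch path under /tmp and printing
the return value is enough; a return of 1 would mean the file is indeed
left behind, 0 that the descriptor is reserved before anything is created.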
>
> On top of that similarly to what other people mentioned the new syscall
> will definitely want to support O_CLOEXEC and probably other flags down
> the line.
>
I agree, and perhaps O_PATH too. Maybe just all open flags relevant to
directories?
> Trying to handle this in open() is a no-go. openat2 is rather
> problematic.
I don't think that is necessarily true. It turned out that O_CREAT |
O_DIRECTORY was buggy for a very long time. Christian Brauner eventually
fixed it, and that combination now returns EINVAL. But I think nothing
really stops us from implementing that combination in the expected way,
apart from whatever reasons there were for not allowing it in the first
place, which I don't know about (maybe mixing semantics?).
>
> I tend to agree mkdirat_fd is not a good name for the syscall either,
> but I don't have a suggestion I'm happy with. I think least bad name
> would follow the existing stuff and be mkdirat2 or similar.
>
> The routine would have to start with validating the passed O_ flags, for
> now only allowing O_CLOEXEC and EINVAL-ing otherwise.
Thanks,
Jori