Re: [RFC PATCH 1/2] vfs: syscalls: add mkdirat_fd()

From: Mateusz Guzik

Date: Tue Apr 07 2026 - 05:03:00 EST


On Wed, Apr 1, 2026 at 12:25 PM Jori Koolstra <jkoolstra@xxxxxxxxx> wrote:
>
>
> > On 01-04-2026 06:19 CEST, Mateusz Guzik <mjguzik@xxxxxxxxx> wrote:
> >
> >
> > On Tue, Mar 31, 2026 at 07:19:58PM +0200, Jori Koolstra wrote:
> > > @@ -5286,7 +5290,25 @@ int filename_mkdirat(int dfd, struct filename *name, umode_t mode)
> > >  		lookup_flags |= LOOKUP_REVAL;
> > >  		goto retry;
> > >  	}
> > > +
> > > +	if (!error && (flags & MKDIRAT_FD_NEED_FD)) {
> > > +		struct path new_path = { .mnt = path.mnt, .dentry = dentry };
> > > +		error = FD_ADD(0, dentry_open(&new_path, O_DIRECTORY, current_cred()));
> > > +	}
> > > +	end_creating_path(&path, dentry);
> > >  	return error;
> >
> >
> > You can't do it like this. Should it turn out no fd can be allocated,
> > the entire thing is going to error out while leaving the newly created
> > directory behind. You need to allocate the fd first, then do the hard
> > work, and only then either fd_install() the fd or free it. The FD_ADD
> > machinery can probably still be used, provided the real new mkdir is
> > wrapped properly.
>
> But isn't this exactly what happens in open(O_CREAT) too? Eventually we
> call
> 	error = dir_inode->i_op->create(idmap, dir_inode, dentry,
> 					mode, open_flag & O_EXCL);
>
> and only then do we assign and install the fd. AFAIK there is no cleanup
> happening there either if the FD_ADD step fails. You will just have a
> regular file and no descriptor. But I would have to test this to be sure.
>

FD_ADD(how->flags, do_file_open(dfd, name, &op)) means the fd itself is
allocated upfront and only then does file creation happen, which is
exactly how I'm saying it should be done. With your patch the
directory is created first and the possibly failing fd allocation
happens later.
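
To make the ordering difference concrete, here is a toy userspace model,
not kernel code. Everything prefixed with toy_ is made up for
illustration: toy_reserve_fd() and toy_release_fd() stand in for
get_unused_fd_flags() and put_unused_fd(), and toy_mkdir() stands in for
the real directory creation. It only models fd-table exhaustion and
ignores every other failure mode.

```c
#include <errno.h>
#include <stdbool.h>

#define TOY_MAX_FDS 4

/* Toy stand-in for per-process state; all names here are hypothetical. */
struct toy_state {
	bool fd_used[TOY_MAX_FDS];	/* reserved or installed fd slots */
	bool dir_exists;		/* did "mkdir" leave a directory behind? */
};

/* Models get_unused_fd_flags(): reserve a slot or fail with -EMFILE. */
static int toy_reserve_fd(struct toy_state *s)
{
	for (int i = 0; i < TOY_MAX_FDS; i++) {
		if (!s->fd_used[i]) {
			s->fd_used[i] = true;
			return i;
		}
	}
	return -EMFILE;
}

/* Models put_unused_fd(): give back a slot that was only reserved. */
static void toy_release_fd(struct toy_state *s, int fd)
{
	s->fd_used[fd] = false;
}

/* Models the actual directory creation. */
static int toy_mkdir(struct toy_state *s)
{
	if (s->dir_exists)
		return -EEXIST;
	s->dir_exists = true;
	return 0;
}

/* Ordering as in the patch: create first, allocate the fd afterwards. */
int toy_mkdir_fd_create_first(struct toy_state *s)
{
	int error = toy_mkdir(s);

	if (error)
		return error;
	/* On -EMFILE the caller sees an error, yet the directory exists. */
	return toy_reserve_fd(s);
}

/* Suggested ordering: reserve the fd, do the work, then commit or undo. */
int toy_mkdir_fd_reserve_first(struct toy_state *s)
{
	int fd = toy_reserve_fd(s);
	int error;

	if (fd < 0)
		return fd;	/* nothing created yet, nothing to clean up */

	error = toy_mkdir(s);
	if (error) {
		toy_release_fd(s, fd);
		return error;
	}
	return fd;		/* the kernel would fd_install() at this point */
}
```

With a full fd table both variants return -EMFILE, but only the
create-first variant leaves dir_exists set.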

> >
> > On top of that similarly to what other people mentioned the new syscall
> > will definitely want to support O_CLOEXEC and probably other flags down
> > the line.
> >
>
> I agree, and perhaps O_PATH too. Maybe just all open flags relevant to
> directories?
>

I don't know about O_PATH as is, but certainly the syscall needs to be
able to grab more flags in the future.
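
For what it's worth, the usual way to keep room for future flags is to
reject every bit the current version does not understand, so that a new
flag can later be distinguished from garbage. A minimal userspace sketch
of that check follows; MKDIRAT_FD_VALID_FLAGS and
mkdirat_fd_check_flags() are hypothetical names, and O_CLOEXEC as the
only accepted flag is an assumption for illustration, not something the
patch defines.

```c
#define _POSIX_C_SOURCE 200809L	/* for O_CLOEXEC under strict -std= modes */
#include <errno.h>
#include <fcntl.h>

/* Hypothetical: the flags this version of the syscall would understand. */
#define MKDIRAT_FD_VALID_FLAGS	((unsigned int)O_CLOEXEC)

/* Returns 0 if all flags are known, -EINVAL if any unknown bit is set. */
int mkdirat_fd_check_flags(unsigned int flags)
{
	if (flags & ~MKDIRAT_FD_VALID_FLAGS)
		return -EINVAL;
	return 0;
}
```

Adding a flag later then only means widening the mask, and old kernels
keep failing cleanly with EINVAL for callers that pass it.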

> > Trying to handle this in open() is a no-go. openat2 is rather
> > problematic.
>
> I don't think that is necessarily true. It turned out O_CREAT | O_DIRECTORY
> was bugged for a very long time. Christian Brauner fixed it eventually, and
> that combination now returns EINVAL. But I think there is nothing really
> stopping us from implementing that combination in the expected way, apart
> from whatever reasons there were for not allowing this in the first place,
> which I don't know about (maybe mixing semantics?)
>

I am not saying it's impossible. I am saying mkdir was always a
separate codepath and in order to change that you would need to add a
branchfest to open. I don't see any reason to go that route.