Re: [RFC PATCH v5 1/2] vfs: add O_CREAT|O_DIRECTORY to open*(2)

From: Jori Koolstra

Date: Wed Jun 03 2026 - 09:25:17 EST



> Op 02-06-2026 17:44 CEST schreef Christian Brauner <brauner@xxxxxxxxxx>:
>
> Yes, I agree. This would change error codes but I don't think it
> matters:
>
> * O_WRONLY | O_DIRECTORY on non-directory -> ENOTDIR
> * O_WRONLY | O_DIRECTORY on directory -> EISDIR
>
> I don't think that really matters and we should be able to collapse this
> to ENOTDIR.

I will pick this up in the next version of O_CREAT|O_DIRECTORY. I think that
makes most sense.

I have an outstanding patch for changing the EACCES/EPERM to EOPNOTSUPP;
Jeff and Jan were skeptical, but I want to know your opinion as well.
I feel the the scenario where userspace has no fall-through but does
handle every single -E listed in the man-page quite unlikely, so I say
lets change them and we'll hear from them if somehow someone relied on
this weird way of error handling.

> > >
> > > There is another point, I maybe should have mentioned in the cover letter: I have not attempted
> > > to handle dangling symlinks for O_MKDIR. Not because I think they are a great idea (as Aleksa
> > > has mentioned, but I am not very familiar with the dragons it entails), but I wanted to discuss
> > > what behavior we want in this case. Do we say that we never do a mkdir after following a lookup
> > > last symlink? I don't think that state is even recorded right now.
> >
> > I think the state might be recorded in nd->depth. But you probably
> > don't want to use that directly. Maybe forcing LOOKUP_FOLLOW to be
> > cleared if O_CREAT|O_DIRECTORY is set would be good. But what would
> > stop you opening an existing directory through a symlink....
> >
> > Probably we need a clear statement of intended semantics which we can
> > review, agree on, then implement. Have you looked at preparing a patch
> > for man-pages to document the change in behaviour for openat etc?
>
> Ugh, dangling symlinks. Actually, scratch that: Ugh, symlinks. So
> O_CREAT without O_NOFOLLOW allows you to create the target of a dangling
> symlink iirc. I always forget that. I think this is a very subtle bug
> and maybe - with both eyes closed - a feature at times.
>
> We should straighten the behavior for O_DIRECTORY | O_CREAT and we
> agreed on that during LSFMM. It would be nice if we could get away with
> simply implying O_NOFOLLOW but I think you're right, Neil, that this
> prevents a valid O_CREAT | O_DIRECTORY on an existing directory which we
> can't do. Makes this kind of a pointless excercise.
>
> But this shouldn't be all that crazy to do right. Using the O_CREAT as
> an _example_ for what we'd need:
>
> fs: refuse O_CREAT through a dangling symlink
>
> open(O_CREAT) without O_EXCL follows a trailing symlink and, when the
> symlink target does not exist, creates it. Refuse to create through a
> dangling symlink instead.
>
> In lookup_open() a negative target reached with nd->depth > 0 was
> arrived at by following a trailing symlink; since the dentry is negative
> the symlink is dangling. Set create_error to -ELOOP in that case.
> Reusing the existing create_error path strips O_CREAT for both the
> generic and ->atomic_open create paths and only reports the error when
> the target is actually negative, so opening an existing target through a
> symlink, interior symlinks, and O_EXCL (which never follows the trailing
> link) are all unaffected.
>
> Hastily-Cobbled-Together-by: Christian Brauner (Amutable) <brauner@xxxxxxxxxx>
>
> diff --git a/fs/namei.c b/fs/namei.c
> index c7fac83c9a85..d20bbcc7e8d3 100644
> --- a/fs/namei.c
> +++ b/fs/namei.c
> @@ -4468,6 +4468,9 @@ static struct dentry *lookup_open(struct nameidata *nd, struct file *file,
> dentry, mode);
> else
> create_error = -EROFS;
> + /* refuse to create through a dangling (trailing) symlink */
> + if (unlikely(nd->depth) && !create_error)
> + create_error = -ELOOP;
> }
> if (create_error)
> open_flag &= ~O_CREAT;
>
> It can't be that easy...

This is what I suggested above, correct, in terms of behavior?

In terms of the patch, I think this will work, but struct nameidata could really
use some commentary for its fields. I spent the last two hours verifying that
nd->depth really does what I thought it did, and I am still not 100% positive.
AFAIS, nd->depth indeed tracks the current symlink depth, which outside of
link_path_walk() reduces to the number of trailing links followed.

But if Neil's rework of lookup_open() is merged we lose access here to nd.
@Neil, have you thought about what would be a good way to resolve that?