Re: [PATCH v3 03/19] VFS: allow d_alloc_name() to be used with ->d_hash
From: NeilBrown
Date: Tue Apr 28 2026 - 22:45:38 EST
On Tue, 28 Apr 2026, Al Viro wrote:
> On Mon, Apr 27, 2026 at 02:01:21PM +1000, NeilBrown wrote:
>
> > +/**
> > + * d_alloc_name: allocate a dentry for use in a dcache-based filesystem.
> > + * @parent: dentry of the parent for the dentry
> > + * @name: name of the dentry
> > + *
> > + * d_alloc_name() allocates a dentry without any protection against
> > + * races. It should only be used in directories that do not support
> > + * create/rename/link inode operations and so is particularly suited for
>
> Contemplate
>
> const struct inode_operations efivarfs_dir_inode_operations = {
> .lookup = simple_lookup,
> .unlink = efivarfs_unlink,
> .create = efivarfs_create,
> };
>
> in fs/efivarfs/inode.c, please.
>
> The only reason efivarfs is still playing with d_alloc() rather than
> using simple_start_creating()/simple_done_creating() (and believe me,
> I had been very tempted to kill that weirdness) is that exclusion in
> there is deeply weird.
>
> It *has* ->create() (and ->unlink(), while we are at it). Both are
> using efivar_lock(). And efivarfs_create_dentry() is called by
> their iterator callbacks - called under efivar_lock(). So we can't
> use anything like the regular exclusion in that case or we would
> deadlock... or, at least, scare the living hell out of lockdep.
>
> In reality there is an exclusion of very irregular sort - and it's
> not entirely correct, at that. It's based upon the fs freeze levels,
> of all things - calls of ->create() and ->unlink() can't overlap with
> calls of ->unfreeze_fs(), which is where that shite comes from after
> the filesystem had been mounted.
>
> Only it's not quite enough - O_RDONLY open() can bloody well race with
> ->unfreeze_fs() and pick dentry before we hit
> inode_lock(inode);
> inode->i_private = entry;
> i_size_write(inode, size + sizeof(__u32)); /* attributes + data */
> inode_unlock(inode);
> in efivarfs_create_dentry(), oopsing on
> static int efivarfs_file_open(struct inode *inode, struct file *file)
> {
> struct efivar_entry *entry = inode->i_private;
>
> file->private_data = entry;
>
> inode_lock(inode);
> entry->open_count++;
> this.
I don't see the problem.
An O_RDONLY open will either find something in the dcache which will be
fully initialised, or it will hit simple_lookup() which will make the
dentry negative, and the open will fail.
If the lookup races with efivarfs_check_missing() calling
efivarfs_create_dentry() then you could end up with two dentries in the
dcache with the same name. One from open/simple_lookup/d_add which is
negative and one from efivarfs_create_dentry/d_alloc/d_make_persistent
which is fully initialised (d_make_persistent() isn't called until after
->i_private is initialised).
The former will be negative and later in the hash chain.
The latter will be positive and earlier in the hash chain.
The former will soon disappear because of DCACHE_DONTCACHE.
The latter will be found by subsequent lookups.
This might be a bit unorthodox, but is not problematic.
>
> When the only reason for your extension is something that flat-out violates
> the conditions of use you are putting into comment in the same commit...
Yes, that condition is a little too simplistic.
* d_alloc_name() allocates a dentry without any protection against
* races. It should only be used in directories that use simple_lookup
* and when create/rename/link inode operations cannot be called, either
* because they don't exist or because some locking prevents it.
* d_alloc_name() is particularly suited for use with
* simple_dir_inode_operations but can also be used freely in
* fill_super/get_tree context and with some care when the filesystem
* is frozen.
*
* The result of d_alloc_name() is typically passed to d_make_persistent().
>
> That crap needs to be straightened out, but I would rather not mix that
> into your series.
>
I can drop this change for now. I only wanted it so that I could remove
the EXPORT of d_alloc() but I am comfortable delaying that.
Thanks,
NeilBrown