Re: [PATCH v3 05/19] VFS: introduce d_alloc_noblock()
From: NeilBrown
Date: Tue Apr 28 2026 - 07:47:35 EST
On Tue, 28 Apr 2026, Al Viro wrote:
> On Mon, Apr 27, 2026 at 02:01:23PM +1000, NeilBrown wrote:
> > From: NeilBrown <neil@xxxxxxxxxx>
> >
> > Several filesystems use the results of readdir to prime the dcache.
> > These filesystems use d_alloc_parallel() which can block if there is a
> > concurrent lookup. Blocking in that case is pointless as the lookup
> > will add info to the dcache and there is no value in the readdir waiting
> > to see if it should add the info too.
>
> ... except that there is - large part of the reasons for that in the original
> user (procfs) is that we want getdents() + open() + fstat() + compare ->st_ino
> from fstat() with ->d_ino from getdents() to work, even if you race with lookup
> from another process coming in the middle of your getdents().
>
> What are your plans in that area?
>
I can't see that procfs needs the directory to be locked at all.
So we could drop the parent lock at the start of readdir and
reclaim at the end. But I decided to just drop it in the rare case
where it could cause a problem.
Below is that patch I have to procfs.
Thanks,
NeilBrown
Author: NeilBrown <neil@xxxxxxxxxx>
Date: Mon Mar 16 16:37:52 2026 +1100
procfs: drop parent lock for d_alloc_parallel() in iterate_shared()
When procfs finds a name in iterate_shared() that isn't in the dcache it
*must* add it so that it can have a stable inode number to report
(inodes are only accessible from the dcache in procfs).
It uses d_alloc_parallel(). A planned change to locking will make it
unsafe to call d_alloc_parallel() while holding the directory lock.
Other filesystems which prime the dcache in iterate_shared() use
d_alloc_noblock() which is safe but can fail if it races with ->lookup.
As procfs cannot handle failure we need something better.
procfs doesn't *need* the parent to be locked. There are no shared data
structures accessed that don't have their own locking. So it is safe to
drop and re-take the parent lock. We could do this around the whole
iteration, but as failure of d_alloc_noblock() is rare it is more
efficient to drop and retake it just around a call of d_alloc_parallel()
when d_alloc_noblock does fail.
Other code that drops and retakes the lock in iterate_shared needs to
be careful to check S_DEAD which could be set while the lock is
dropped. This is not needed in procfs as the flag is never set.
As d_alloc_noblock() calls try_lookup_noperm(), we can skip that call
and simplify the code.
Signed-off-by: NeilBrown <neil@xxxxxxxxxx>
diff --git a/fs/proc/base.c b/fs/proc/base.c
index d55a4b603188..feed586fa736 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -2127,24 +2127,32 @@ bool proc_fill_cache(struct file *file, struct dir_context *ctx,
unsigned type = DT_UNKNOWN;
ino_t ino = 1;
- child = try_lookup_noperm(&qname, dir);
+ child = d_alloc_noblock(dir, &qname);
if (IS_ERR(child))
goto end_instantiate;
- if (!child) {
+ if (child == ERR_PTR(-EWOULDBLOCK)) {
+ /*
+ * Need to drop directory lock, which isn't really
+ * needed here anyway. As rmdir never happens in procfs
+ * we don't need to be concerned about S_DEAD being set
+ * while unlocked.
+ */
+ inode_unlock_shared(dir->d_inode);
child = d_alloc_parallel(dir, &qname);
- if (IS_ERR(child))
- goto end_instantiate;
- if (d_in_lookup(child)) {
- struct dentry *res;
- res = instantiate(child, task, ptr);
- d_lookup_done(child);
- if (unlikely(res)) {
- dput(child);
- child = res;
- if (IS_ERR(child))
- goto end_instantiate;
- }
+ inode_lock_shared(dir->d_inode);
+ }
+ if (IS_ERR(child))
+ goto end_instantiate;
+ if (d_in_lookup(child)) {
+ struct dentry *res;
+ res = instantiate(child, task, ptr);
+ d_lookup_done(child);
+ if (unlikely(res)) {
+ dput(child);
+ child = res;
+ if (IS_ERR(child))
+ goto end_instantiate;
}
}
inode = d_inode(child);
diff --git a/fs/proc/proc_sysctl.c b/fs/proc/proc_sysctl.c
index 04a382178c65..c19e1ccd6bb9 100644
--- a/fs/proc/proc_sysctl.c
+++ b/fs/proc/proc_sysctl.c
@@ -686,29 +686,34 @@ static bool proc_sys_fill_cache(struct file *file,
ino_t ino = 0;
unsigned type = DT_UNKNOWN;
- qname.name = table->procname;
- qname.len = strlen(table->procname);
- qname.hash = full_name_hash(dir, qname.name, qname.len);
-
- child = d_lookup(dir, &qname);
- if (!child) {
+ qname = QSTR(table->procname);
+ child = d_alloc_noblock(dir, &qname);
+ if (child == ERR_PTR(-EWOULDBLOCK)) {
+ /*
+ * Need to drop directory lock, which isn't really
+ * needed here anyway. As rmdir never happens in procfs
+ * we don't need to be concerned about S_DEAD being set
+ * while unlocked.
+ */
+ inode_unlock_shared(dir->d_inode);
child = d_alloc_parallel(dir, &qname);
- if (IS_ERR(child))
- return false;
- if (d_in_lookup(child)) {
- struct dentry *res;
- inode = proc_sys_make_inode(dir->d_sb, head, table);
- res = d_splice_alias_ops(inode, child,
- &proc_sys_dentry_operations);
- d_lookup_done(child);
- if (unlikely(res)) {
- dput(child);
-
- if (IS_ERR(res))
- return false;
-
- child = res;
- }
+ inode_lock_shared(dir->d_inode);
+ }
+ if (IS_ERR(child))
+ return false;
+ if (d_in_lookup(child)) {
+ struct dentry *res;
+ inode = proc_sys_make_inode(dir->d_sb, head, table);
+ res = d_splice_alias_ops(inode, child,
+ &proc_sys_dentry_operations);
+ d_lookup_done(child);
+ if (unlikely(res)) {
+ dput(child);
+
+ if (IS_ERR(res))
+ return false;
+
+ child = res;
}
}
inode = d_inode(child);