Re: PROBLEM: FUSE_NOTIFY_INVAL_ENTRY leaves stale negative dentry after c9ba789dad15
From: NeilBrown
Date: Mon Jun 01 2026 - 19:43:40 EST
On Mon, 01 Jun 2026, Miklos Szeredi wrote:
> [Cc: Neil]
>
> On Sun, 31 May 2026 at 03:12, Артем Лабазов <123321artyom@xxxxxxxxx> wrote:
> >
> > Hi,
> >
> > I am reporting a FUSE/VFS regression introduced by:
> >
> > c9ba789dad15ba65662bba17595c0aeaa0cfcf1c
> > VFS: introduce start_creating_noperm() and start_removing_noperm()
> >
> > After a FUSE filesystem returns a cached negative lookup for a name, then
> > later makes that name visible and sends FUSE_NOTIFY_INVAL_ENTRY for it,
> > readdir() can list the entry while stat()/open() on the same path still
> > returns ENOENT.
> >
> > Bisect result:
> >
> > good: v6.17
> > bad: v6.19
> >
> > first bad commit:
> > c9ba789dad15ba65662bba17595c0aeaa0cfcf1c
> > VFS: introduce start_creating_noperm() and start_removing_noperm()
> >
> > Adjacent commits:
> >
> > bd6ede8a06e89ca5a94a8b51cea792705d1b8ca2 GOOD
> > c9ba789dad15ba65662bba17595c0aeaa0cfcf1c BAD
> >
> > The bad commit changes fs/fuse/dir.c:fuse_reverse_inval_entry() to use
> > start_removing_noperm().
> >
> > I attached a minimal libfuse3 reproducer. It does:
> >
> > 1. lookup("config") returns ino=0 with non-zero entry_timeout, causing
> > the kernel to cache a negative dentry.
> > 2. The filesystem makes "config" visible.
> > 3. It calls fuse_lowlevel_notify_inval_entry(root, "config").
> > 4. The test checks whether readdir lists "config" and stat("config")
> > succeeds.
>
> Yeah, issue is that start_removing_* all work on a POSITIVE dentry.
>
> FUSE_NOTIFY_INVAL_ENTRY works on both positive and negative dentries,
> doesn't care about the inode (or lack of it) at all.
>
> I've not dug myself into this work, so not sure what the concept here
> is, or how this should be fixed, but this is definitely not (just) a
> removal op.
This code is quite unique. It is interactive with the dcache, not the
backing store - which makes sense for fuse but is unlike other
filesystems.
It doesn't do a lookup - just finds things in the dcache which is then
invalidates.
This means the positive an negative cases are quite different and I
think it works well to treat them separately.
For a negative dentry, we can use ->d_lock to avoid races with the
dentry becoming positive. d_invalidate() becomes __d_drop() for a
negative dentry, so everything we need to do is atomic.
For a positive dentry we still need to get the directory lock, and
start_removing_dentry() works find in that case.
So the following patch is what I would suggest. It probably deserves to
have some comments sprinkled through to make it clear what is happening.
If people could review/test then if I get positive feedback I can make a
proper patch.
Thanks,
NeilBrown
diff --git a/fs/fuse/dir.c b/fs/fuse/dir.c
index b658b6baf72f..3f16ecbc358e 100644
--- a/fs/fuse/dir.c
+++ b/fs/fuse/dir.c
@@ -1605,13 +1605,29 @@ int fuse_reverse_inval_entry(struct fuse_conn *fc, u64 parent_nodeid,
struct dentry *child = try_lookup_noperm(name, dir);
if (!child || IS_ERR(child))
goto put_parent;
- entry = start_removing_dentry(dir, child);
- dput(child);
- if (IS_ERR(entry))
- goto put_parent;
- if (!d_same_name(entry, dir, name)) {
- end_removing(entry);
- entry = NULL;
+ if (d_really_is_negative(child)) {
+ spin_lock(&child->d_lock);
+ if (d_really_is_negative(child)) {
+ fuse_dir_changed(parent);
+ if (!(flags & FUSE_EXPIRE_ONLY))
+ __d_drop(entry);
+ fuse_invalidate_entry_cache(entry);
+ spin_unlock(&child->d_lock);
+ dput(child);
+ err = 0;
+ goto put_parent;
+ }
+ spin_unlock(&child->d_lock);
+ dput(child);
+ } else {
+ entry = start_removing_dentry(dir, child);
+ dput(child);
+ if (IS_ERR(entry))
+ goto put_parent;
+ if (!d_same_name(entry, dir, name)) {
+ end_removing(entry);
+ entry = NULL;
+ }
}
}