Re: KCSAN: data-race in __d_drop / retain_dentry

From: Al Viro

Date: Wed Apr 08 2026 - 19:30:40 EST


On Thu, Apr 09, 2026 at 12:12:40AM +0100, Al Viro wrote:
> On Wed, Mar 11, 2026 at 04:02:41PM +0800, Jianzhou Zhao wrote:
>
> > Execution Flow & Code Context
> > When a dentry receives its final decrement during a path operation (e.g., inside `dput`), its lifecycle might traverse `__dentry_kill()` leading to `__d_drop()`. Here, VFS manually eradicates the dentry from the hash list by assigning `NULL` to the internal double-linked list pointer tracker `pprev`:
> > ```c
> > // fs/dcache.c
> > void __d_drop(struct dentry *dentry)
> > {
> > 	if (!d_unhashed(dentry)) {
> > 		___d_drop(dentry);
> > 		...
> > 		dentry->d_hash.pprev = NULL; // <-- Plain concurrent write
> > 		write_seqcount_invalidate(&dentry->d_seq);
> > 	}
> > }
> > ```
> >
> > Simultaneously, another thread undergoing an optimistic lockless `dput`
>
> Without having held the reference it's dropping?

Note that if the sequence is
	A: fast_dput(): count 1->0
	B: grab reference, count 0->1
	B: drop reference, count 1->0, grab ->d_lock and proceed to __dentry_kill()
	B: in __dentry_kill() set count negative
	B: in __dentry_kill() clear ->d_hash.pprev
	A: call retain_dentry()
which is legitimate, not noticing d_unhashed() in retain_dentry() is fine -
fast_dput() will proceed to
	spin_lock(&dentry->d_lock);
	if (dentry->d_lockref.count || retain_dentry(dentry, true)) {
notice that ->d_lockref.count is negative and bugger off to
		spin_unlock(&dentry->d_lock);
		return true;
with rcu_read_lock() still held, same as it would if retain_dentry()
had returned true.
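
FWIW, the sequence above can be replayed in a userspace toy model; a rough
sketch only, not kernel code. Everything named toy_* is made up for the
illustration, locking and RCU are not modelled at all, and the value stored
for "dead" is simply some negative number. The only point is that A ends up
returning true whichever answer its lockless retain_dentry() call happened
to give.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct dentry: only the two fields involved. */
struct toy_dentry {
	int count;	/* models ->d_lockref.count; negative means dead */
	void *pprev;	/* models ->d_hash.pprev; NULL means unhashed */
};

static bool toy_unhashed(const struct toy_dentry *d)
{
	return d->pprev == NULL;
}

/* Crude model of retain_dentry(): an unhashed dentry is not worth keeping. */
static bool toy_retain_dentry(const struct toy_dentry *d)
{
	return !toy_unhashed(d);
}

/* B's part of __dentry_kill(): mark dead, then unhash. */
static void toy_kill(struct toy_dentry *d)
{
	d->count = -1;		/* any negative value will do for the model */
	d->pprev = NULL;
}

/* A's recheck in fast_dput(), which the real code does under ->d_lock. */
static bool toy_fast_dput_recheck(const struct toy_dentry *d)
{
	return d->count || toy_retain_dentry(d);
}

/*
 * Replay the interleaving.  lockless_hint is whatever A's unlocked
 * retain_dentry() call returned (it may or may not have seen the store
 * to pprev).  Returns what A's fast_dput() would return.
 */
bool toy_replay(bool lockless_hint)
{
	static int hash_slot;	/* something non-NULL for pprev to point at */
	struct toy_dentry d = { .count = 1, .pprev = &hash_slot };

	d.count--;		/* A: count 1->0 */
	d.count++;		/* B: grab reference, count 0->1 */
	d.count--;		/* B: drop reference, count 1->0 */
	toy_kill(&d);		/* B: count negative, pprev cleared */

	if (lockless_hint)	/* A: lockless retain_dentry() said "keep" */
		return true;
	return toy_fast_dput_recheck(&d);	/* A: sees negative count */
}
```

Both toy_replay(true) and toy_replay(false) return true here: in the second
case the recheck sees the nonzero (negative) count before retain_dentry()
is ever consulted.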

See the comments in fast_dput(), specifically
	/*
	 * Did somebody else grab a reference to it in the meantime, and
	 * we're no longer the last user after all? Alternatively, somebody
	 * else could have killed it and marked it dead. Either way, we
	 * don't need to do anything else.
	 */