Re: [PATCH 11/19] VFS: Add ability to exclusively lock a dentry and use for create/remove operations.

From: Al Viro
Date: Sun Feb 09 2025 - 01:40:47 EST


On Thu, Feb 06, 2025 at 04:42:48PM +1100, NeilBrown wrote:

> +bool d_update_lock(struct dentry *dentry,
> +		   struct dentry *base, const struct qstr *last,
> +		   unsigned int subclass)
> +{
> +	lock_acquire_exclusive(&dentry->d_update_map, subclass, 0, NULL, _THIS_IP_);
> +again:
> +	spin_lock(&dentry->d_lock);
> +	wait_var_event_spinlock(&dentry->d_flags,
> +				!check_dentry_locked(dentry),
> +				&dentry->d_lock);
> +	if (d_is_positive(dentry)) {
> +		rcu_read_lock(); /* needed for d_same_name() */

It isn't. You are holding ->d_lock there.

> +		if (
> +			/* Was unlinked while we waited ?*/
> +			d_unhashed(dentry) ||
> +			/* Or was dentry renamed ?? */
> +			dentry->d_parent != base ||
> +			dentry->d_name.hash != last->hash ||
> +			!d_same_name(dentry, base, last)

Negatives can't be moved, but they bloody well can be unhashed. So skipping
the d_unhashed() part for negatives is wrong.

> +		) {
> +			rcu_read_unlock();
> +			spin_unlock(&dentry->d_lock);
> +			lock_map_release(&dentry->d_update_map);
> +			return false;
> +		}
> +		rcu_read_unlock();
> +	}
> +	/* Must ensure DCACHE_PAR_UPDATE in child is visible before reading
> +	 * from parent
> +	 */
> +	smp_store_mb(dentry->d_flags, dentry->d_flags | DCACHE_PAR_UPDATE);

... paired with?

> +	if (base->d_flags & DCACHE_PAR_UPDATE) {
> +		/* We cannot grant DCACHE_PAR_UPDATE on a dentry while
> +		 * it is held on the parent
> +		 */
> +		dentry->d_flags &= ~DCACHE_PAR_UPDATE;
> +		spin_unlock(&dentry->d_lock);
> +		spin_lock(&base->d_lock);
> +		wait_var_event_spinlock(&base->d_flags,
> +					!check_dentry_locked(base),
> +					&base->d_lock);

Oh? So you might also be waiting on the parent? That's deadlock fodder right
there - the caller might be holding ->i_rwsem on the same parent, so you have
waiting on _->d_flags nested both outside and inside _->d_inode->i_rwsem.

Just in case anyone goes "->i_rwsem will only be held shared" - that wouldn't help.
Throw fchmod() into the mix and enjoy your deadlock -
A: holds ->i_rwsem shared, waits for C to clear DCACHE_PAR_UPDATE.
B: blocked trying to grab ->i_rwsem exclusive
C: has DCACHE_PAR_UPDATE set, is blocked trying to grab ->i_rwsem shared
and there you go...
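
To make the cycle above explicit, here is a minimal userspace sketch that writes the three tasks down as a wait-for graph and checks it for a loop. It is a toy model only: the names (task_is_deadlocked, fchmod_scenario, NOBODY) are made up for illustration and nothing here is kernel code. It assumes, as the scenario does, that a reader arriving at ->i_rwsem gets queued behind an already waiting writer.

```c
#include <stdbool.h>

/*
 * Toy wait-for graph for the A/B/C scenario. Hypothetical model,
 * not kernel code: each task is blocked on at most one other task.
 */
enum task { TASK_A, TASK_B, TASK_C, NR_TASKS };
#define NOBODY (-1)

/* waits_for[t] is the task that t is blocked on, or NOBODY. */
static const int fchmod_scenario[NR_TASKS] = {
	[TASK_A] = TASK_C, /* holds ->i_rwsem shared, waits for C to clear DCACHE_PAR_UPDATE */
	[TASK_B] = TASK_A, /* wants ->i_rwsem exclusive, blocked by A's shared hold */
	[TASK_C] = TASK_B, /* wants ->i_rwsem shared, assumed queued behind writer B */
};

/* Return true if following wait-for edges from 'start' leads back to 'start'. */
static bool task_is_deadlocked(const int waits_for[NR_TASKS], int start)
{
	int cur = waits_for[start];

	/* A cycle through 'start' needs at most NR_TASKS hops. */
	for (int steps = 0; steps < NR_TASKS && cur != NOBODY; steps++) {
		if (cur == start)
			return true;
		cur = waits_for[cur];
	}
	return false;
}
```

With fchmod_scenario every task sits on the cycle A -> C -> B -> A. Take B out of the picture (B and C waiting on NOBODY) and A is merely waiting, not deadlocked, which is why the exclusive waiter is what closes the loop even though A and C only ever take the lock shared.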

> +		spin_unlock(&base->d_lock);
> +		goto again;
> +	}
> +	spin_unlock(&dentry->d_lock);
> +	return true;
> +}

The entire thing is refcount-neutral for both dentry and base. Which makes this

> @@ -1759,8 +1863,9 @@ static struct dentry *lookup_and_lock_nested(const struct qstr *last,
>
>  	if (!(lookup_flags & LOOKUP_PARENT_LOCKED))
>  		inode_lock_nested(base->d_inode, subclass);
> -
> -	dentry = lookup_one_qstr(last, base, lookup_flags);
> +	do {
> +		dentry = lookup_one_qstr(last, base, lookup_flags);
> +	} while (!IS_ERR(dentry) && !d_update_lock(dentry, base, last, subclass));

... a refcount leak waiting to happen.
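
The leak can be counted in a toy userspace model of that retry loop. This is a sketch under stated assumptions, not a proposed fix: toy_dentry, toy_lookup, toy_update_lock and toy_dput are hypothetical stand-ins, the lookup is assumed to hand back a referenced dentry (as the leak remark implies), and the lock step is refcount-neutral as noted above. The "balanced" variant only shows where a reference would have to be dropped for the counts to add up.

```c
#include <stdbool.h>

/* Toy stand-in for a dentry: nothing but a reference count. */
struct toy_dentry { int refcount; };

/* Models the lookup step: assumed to return the dentry with one extra reference. */
static struct toy_dentry *toy_lookup(struct toy_dentry *d)
{
	d->refcount++;
	return d;
}

/* Models dropping a reference. */
static void toy_dput(struct toy_dentry *d)
{
	d->refcount--;
}

/* Models the lock step: refcount-neutral, fails while *failures_left > 0. */
static bool toy_update_lock(struct toy_dentry *d, int *failures_left)
{
	(void)d;
	if (*failures_left > 0) {
		(*failures_left)--;
		return false;
	}
	return true;
}

/* Same shape as the quoted do/while loop: every failed attempt keeps its reference. */
static struct toy_dentry *lookup_and_lock_leaky(struct toy_dentry *d, int failures)
{
	struct toy_dentry *found;

	do {
		found = toy_lookup(d);
	} while (!toy_update_lock(found, &failures));
	return found;
}

/* Illustrative variant: drop the reference before retrying. */
static struct toy_dentry *lookup_and_lock_balanced(struct toy_dentry *d, int failures)
{
	struct toy_dentry *found;

	for (;;) {
		found = toy_lookup(d);
		if (toy_update_lock(found, &failures))
			return found;
		toy_dput(found);
	}
}
```

Starting from a count of zero and two failed lock attempts, the leaky loop returns with three references held where the caller expects one; the balanced variant returns with exactly one.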