Re: [PATCH v3 04/19] VFS: use wait_var_event for waiting in d_alloc_parallel()

From: NeilBrown

Date: Tue Apr 28 2026 - 09:02:22 EST


On Tue, 28 Apr 2026, Al Viro wrote:
> On Mon, Apr 27, 2026 at 02:01:22PM +1000, NeilBrown wrote:
> > From: NeilBrown <neil@xxxxxxxxxx>
> >
> > d_alloc_parallel() currently requires a wait_queue_head to be passed in.
> > This must have a life time which extends until the lookup is completed.
> >
> > This makes it awkward to use and particularly make it hard to use in
> > lookup_one_qstr_excl() which I hope to do in the future.
> >
> > This patch changes d_alloc_parallel() to use wake_up_var_locked() to
> > wake up waiters, and wait_var_event_spinlock() to wait. dentry->d_lock
> > is used for synchronisation as it is already held and the relevant
> > times.
> >
> > In most cases there will be no waiters so the wake_up_var_locked()
> > call is a complete waste. To optimise this a new ->d_flags flag is
> > added: DCACHE_LOCK_WAITERS. This is set whenever any thread prepares to
> > wait for the dentry, and if it isn't set when DCACHE_PAR_LOOKUP is
> > cleared, no wakeup is sent.
> > (The name is deliberately generic as I plan to replace DCACHE_PAR_LOOKUP
> > with more generic per-dentry locking in the future).
> >
> > __d_lookup_unhash() now returns a bool rather than a wq. This is true
> > if DCACHE_LOCK_WAITERS was sent and is used to decide to send the wake
> > up. It would be easier to send the wakeup immediately when clearing
> > DCACHE_LOCK_WAITERS, but then the waiter could wake a bit earlier and
> > then spend time spinning on ->d_lock. I don't know if that cost is
> > interesting.
>
> I definitely like the calling conventions change, so much that I'd be glad
> to pick that one Right Fucking Now. I'd probably make the store in
> d_must_wait() conditional, though - ->d_flags and ->d_lock are in different
> cachelines and there's no need to dirty both every time we are called.
> IOW, have d_must_wait() do this:
> if (!d_in_lookup(dentry))
> return false;
> if (!(dentry->d_flags & DCACHE_LOCK_WAITER))
> dentry->d_flags |= DCACHE_LOCK_WAITER;
> return true;

The only time that DCACHE_LOCK_WAITER will already be set is when there
are two (or more) waiters as well as the thread they are waiting on.
That means three (or more) threads all accessing the same name at the
same time. How often does that happen? Is the micro-optimisation worth
the small increase in code size?

> Not sure if it would flow better with return values inverted (and
> negation removed from wait_var_event_spinlock(), that is)...

I think the central question is what we would call the function.

"d_lookup_finished()" might work though it could be confused with
d_lookup_done() which is quite different.
But then it would be less obvious what DCACHE_LOCK_WAITER is being set.
Having "wait" in the name ("d_must_wait") makes that a little clearer I
think.
So while I agree it would be nicer to not have the negation in the
wait_var_event_spinlock, I can't think of a name that would result in a
net improvement.

>
> > static inline void end_dir_add(struct inode *dir, unsigned int n,
> > - wait_queue_head_t *d_wait)
> > + bool do_wake, struct dentry *de)
> > {
> > smp_store_release(&dir->i_dir_seq, n + 2);
> > preempt_enable_nested();
> > - if (wq_has_sleeper(d_wait))
> > - wake_up_all(d_wait);
> > + if (do_wake)
> > + wake_up_var_locked(&de->d_flags, &de->d_lock);
> > }
>
> This calling conventions change, OTOH, I don't like at all. I mean,
> (dir, n, false, unused) vs. (dir, n, true, never_NULL) is seriously
> asking to be reduced to (dir, n, NULL) vs. (dir, n, never_NULL).

In the callers it is cleaner to keep the do_wake flag separate.
So the call
end_dir_add(dir, n, do_wake, dentry)
would become
end_dir_add(dir, n, do_wake ? dentry : NULL)

which isn't beautiful but is probably OK and worth it for the clarity
added to end_dir_add().

I'll make that change.

>
> > @@ -2800,29 +2793,29 @@ EXPORT_SYMBOL(d_alloc_parallel);
> > * - Retrieve and clear the waitqueue head in dentry
> > * - Return the waitqueue head
>
> ... not anymore. The entire comment needs replacement, TBH - 2 lines of
> 3 are stale with that patch and remaining one is... ambiguous, since
> there are *two* hashes around and "Unhash the dentry" usually refers to
> the other one.
>
> Something like
>
> /*
> * Move dentry from in-lookup state to busy-negative one.
> *
> * From now on d_in_lookup(dentry) will return false and dentry is gone from
> * in-lookup hash.
> *
> * Anyone who had been waiting on it in d_alloc_parallel() is free to
> * proceed after that. Note that waking such waiters up is left to
> * the callers; we might be called in write-side critical area for ->i_dir_seq,
> * and PREEMPT_RT kernels can't have that wakeup done in those.
> *
> * Returns whether there are waiters to be woken up.
> */
>
> perhaps?
>

Yes, something like that. I'll look at it again in the morning and
see what seems best.

Thanks,
NeilBrown