Re: [PATCH v3 04/19] VFS: use wait_var_event for waiting in d_alloc_parallel()
From: Al Viro
Date: Thu Apr 30 2026 - 23:37:32 EST
On Fri, May 01, 2026 at 11:39:59AM +1000, NeilBrown wrote:
> I dislike them too. I doubt I can find solutions that either of us
> like, but they should be relatively short-lived. Once we push the
> locking down into the inode_operations the filesystem will be in a
> position to hold the lock only when it actually needs it (if at all).
... or we'll end up with hard-to-formulate constraints on what a filesystem
may do with its internal locking to use the APIs provided by fs/{dcache,namei}.c
safely.
Note that e.g. "->iterate_shared() wants to know the synthetic inumbers
a concurrent stat(2) would inject into dcache" (which is the original
reason for dcache preseeding in that thing) is not uncommon. In procfs
you are lucky to have no mkdir() and friends; the same is not true in
general and we'd better have a sane answer to "what could a filesystem
like that do with its internal locking". Or that thing will get blindly
copied, with predictable results.
> I'm confident that dropping the lock is safe. If there was some way to
> tell the VFS that the lock has already been dropped, then we wouldn't
> need to reclaim it, but I cannot see a clean way to do that.
FWIW, I'm more concerned about ->iterate_shared() - d_add_ci() is garbage
that isn't used on a sanely configured kernel; ls -lR is not going away,
no matter what, and exclusion requirements are going to be a lot more
interesting for that one anyway. It might be worth teaching iterate_dir()
that in such-and-such conditions it ought to save position, drop the lock,
do a lookup on the name stashed in dir_context, retake the lock and call back into
->iterate_shared() from the saved position. With a helper callable by ->iterate_shared()
instances if they run into a failing d_alloc_trylock() in a situation where they
can't just shrug and move on... Not sure.
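To make the control flow of that idea concrete, here is a rough userspace toy model, not kernel code. Every name in it (toy_dir, toy_ctx, toy_iterate(), toy_iterate_dir(), toy_lookup()) is hypothetical and stands in for the corresponding kernel piece only loosely; the "lock" is a plain flag checked with assert(). It deliberately ignores the hard part, namely what happens to the saved position if the directory is modified while the lock is dropped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Toy userspace model; none of these names are kernel APIs. */
#define NENTRIES 4

struct toy_dir {
	bool locked;                 /* stands in for the directory lock */
	const char *names[NENTRIES];
	bool cached[NENTRIES];       /* "entry already preseeded" */
	bool contended[NENTRIES];    /* trylock-style preseeding would fail */
};

struct toy_ctx {
	size_t pos;                  /* saved position in the directory */
	const char *pending;         /* name stashed for the outer loop */
	size_t emitted;              /* entries reported so far */
	size_t lock_drops;           /* times the lock had to be dropped */
};

enum toy_status { TOY_DONE, TOY_NEED_LOOKUP };

static void toy_lock(struct toy_dir *d)   { assert(!d->locked); d->locked = true; }
static void toy_unlock(struct toy_dir *d) { assert(d->locked);  d->locked = false; }

/* Stand-in for a blocking lookup; must run without the lock held. */
static void toy_lookup(struct toy_dir *d, const char *name)
{
	assert(!d->locked);
	for (size_t i = 0; i < NENTRIES; i++) {
		if (strcmp(d->names[i], name) == 0) {
			d->cached[i] = true;
			d->contended[i] = false;
		}
	}
}

/* Stand-in for an ->iterate_shared() instance; runs with the lock held. */
static enum toy_status toy_iterate(struct toy_dir *d, struct toy_ctx *ctx)
{
	assert(d->locked);
	while (ctx->pos < NENTRIES) {
		size_t i = ctx->pos;

		if (!d->cached[i]) {
			if (d->contended[i]) {
				/* Can't shrug and move on: stash the name, keep pos. */
				ctx->pending = d->names[i];
				return TOY_NEED_LOOKUP;
			}
			d->cached[i] = true; /* uncontended preseeding succeeded */
		}
		ctx->emitted++;
		ctx->pos++;
	}
	return TOY_DONE;
}

/* Stand-in for the outer loop: drop lock, look up, retake, resume. */
static void toy_iterate_dir(struct toy_dir *d, struct toy_ctx *ctx)
{
	toy_lock(d);
	while (toy_iterate(d, ctx) == TOY_NEED_LOOKUP) {
		toy_unlock(d);
		ctx->lock_drops++;
		toy_lookup(d, ctx->pending);
		ctx->pending = NULL;
		toy_lock(d);
	}
	toy_unlock(d);
}
```

With four entries of which the second and fourth are marked contended, toy_iterate_dir() reports all four entries and drops the lock twice, resuming each time from the position at which the instance bailed out.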
What kind of exclusion do you have in mind for foo_iterate_shared() in the
long run? Assuming that filesystem has directory-modifying operations, as
well as lookups, and its inumbers are synthetic.
BTW, do you have AFS and CIFS counterparts of your stuff from back in 2022
that killed d_rehash() uses in fs/nfs? I would love to kill d_rehash();
the exfat use is easily removable junk, but the fs/afs and fs/smb/client ones
are trickier and the reasons why it needed to be killed in fs/nfs apply
to those as well.