Re: [PATCH 0/2] fsnotify: fix softlockups iterating over d_subdirs
From: Amir Goldstein
Date: Tue Oct 18 2022 - 04:08:11 EST
On Tue, Oct 18, 2022 at 7:12 AM Stephen Brennan
<stephen.s.brennan@xxxxxxxxxx> wrote:
>
> Hi Jan, Amir, Al,
>
> Here's my first shot at implementing what we discussed. I tested it using the
> negative dentry creation tool I mentioned in my previous message, with a similar
> workflow. Rather than having a bunch of threads accessing the directory to
> create that "thundering herd" of CPUs in __fsnotify_update_child_dentry_flags, I
> just started a lot of inotifywait tasks:
>
> 1. Create 100 million negative dentries in a dir
> 2. Use trace-cmd to watch __fsnotify_update_child_dentry_flags:
>    trace-cmd start -p function_graph -l __fsnotify_update_child_dentry_flags
>    sudo cat /sys/kernel/debug/tracing/trace_pipe
> 3. Run a lot of inotifywait tasks: for i in {1..10}; do inotifywait $dir & done
>
> With step #3, I see only one execution of __fsnotify_update_child_dentry_flags.
> Once that completes, all the inotifywait tasks say "Watches established".
> Similarly, once an access occurs in the directory, a single
> __fsnotify_update_child_dentry_flags execution occurs, and all the tasks exit.
> In short: it works great!
>
> However, while testing this, I've observed a dentry still in use warning during
> unmount of rpc_pipefs on the "nfs" dentry during shutdown. NFS is of course in
> use, and I assume that fsnotify must have been used to trigger this. The error
> is not there on mainline without my patch so it's definitely caused by this
> code. I'll continue debugging it but I wanted to share my first take on this so
> you could take a look.
>
> [ 1595.197339] BUG: Dentry 000000005f5e7197{i=67,n=nfs} still in use (2) [unmount of rpc_pipefs rpc_pipefs]
>
Hmm, the assumption we made about partial stability of d_subdirs
under dir inode lock looks incorrect for rpc_pipefs.
None of the functions that update the rpc_pipefs dcache take the parent
inode lock.

The assumption looks incorrect for other pseudo fs as well.
The other side of the coin is that we do not really need to worry
about walking a huge list of pseudo fs children.

The question is how to classify those pseudo fs and whether there
are other cases like this that we missed.
Perhaps having simple_dentry_operations is a good enough
clue, but perhaps it is not enough. I am not sure.
It covers all the cases of pseudo fs that I know about, so you
can certainly use this clue to avoid going to sleep in the
update loop as a first approximation.
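
To make the suggestion concrete, here is a rough userspace model of the
check I have in mind. It is only a sketch: the structs are stripped-down
stand-ins for the kernel ones, and looks_like_pseudo_fs() and
update_loop_may_sleep() are hypothetical names that exist in neither
mainline nor the posted patches. The only idea taken from the above is
comparing d_op against simple_dentry_operations.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-ins for the kernel types; only the fields used here. */
struct dentry_operations {
	int (*d_delete)(const void *dentry);
};

struct dentry {
	const struct dentry_operations *d_op;
};

/* Stand-in for the kernel's simple_dentry_operations (fs/libfs.c). */
static int always_delete(const void *dentry)
{
	(void)dentry;
	return 1;
}

static const struct dentry_operations simple_dentry_operations = {
	.d_delete = always_delete,
};

/*
 * Hypothetical helper: treat a directory as belonging to a pseudo fs
 * when its dentry uses simple_dentry_operations. This is a heuristic
 * and may miss pseudo filesystems that install their own d_op.
 */
static bool looks_like_pseudo_fs(const struct dentry *parent)
{
	assert(parent != NULL);
	return parent->d_op == &simple_dentry_operations;
}

/*
 * Hypothetical helper: only allow the child-flags update loop to sleep
 * when the parent does not look like a pseudo fs directory, since those
 * may change d_subdirs without taking the parent inode lock.
 */
static bool update_loop_may_sleep(const struct dentry *parent)
{
	return !looks_like_pseudo_fs(parent);
}
```

With this, a directory on rpc_pipefs would keep the old non-sleeping walk,
while anything else would be allowed to reschedule in the loop. Whether
the d_op comparison is a reliable classifier is exactly the open question.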

I can try to figure this out, but I would prefer that Al chime in to
provide reliable answers to those questions.

Thanks,
Amir.