Re: [PATCH 0/2] fsnotify: fix softlockups iterating over d_subdirs

From: Stephen Brennan
Date: Thu Oct 27 2022 - 18:07:19 EST


Amir Goldstein <amir73il@xxxxxxxxx> writes:
> On Wed, Oct 19, 2022 at 2:52 AM Stephen Brennan
> <stephen.s.brennan@xxxxxxxxxx> wrote:
>>
>> Amir Goldstein <amir73il@xxxxxxxxx> writes:
>> > On Tue, Oct 18, 2022 at 7:12 AM Stephen Brennan
>> > <stephen.s.brennan@xxxxxxxxxx> wrote:
>> >>
>> >> Hi Jan, Amir, Al,
>> >>
>> >> Here's my first shot at implementing what we discussed. I tested it using the
>> >> negative dentry creation tool I mentioned in my previous message, with a similar
>> >> workflow. Rather than having a bunch of threads accessing the directory to
>> >> create that "thundering herd" of CPUs in __fsnotify_update_child_dentry_flags, I
>> >> just started a lot of inotifywait tasks:
>> >>
>> >> 1. Create 100 million negative dentries in a dir
>> >> 2. Use trace-cmd to watch __fsnotify_update_child_dentry_flags:
>> >> trace-cmd start -p function_graph -l __fsnotify_update_child_dentry_flags
>> >> sudo cat /sys/kernel/debug/tracing/trace_pipe
>> >> 3. Run a lot of inotifywait tasks: for i in {1..10}; do inotifywait $dir & done
>> >>
>> >> With step #3, I see only one execution of __fsnotify_update_child_dentry_flags.
>> >> Once that completes, all the inotifywait tasks say "Watches established".
>> >> Similarly, once an access occurs in the directory, a single
>> >> __fsnotify_update_child_dentry_flags execution occurs, and all the tasks exit.
>> >> In short: it works great!
>> >>
>> >> However, while testing this, I've observed a dentry still in use warning during
>> >> unmount of rpc_pipefs on the "nfs" dentry during shutdown. NFS is of course in
>> >> use, and I assume that fsnotify must have been used to trigger this. The error
>> >> is not there on mainline without my patch so it's definitely caused by this
>> >> code. I'll continue debugging it but I wanted to share my first take on this so
>> >> you could take a look.
>> >>
>> >> [ 1595.197339] BUG: Dentry 000000005f5e7197{i=67,n=nfs} still in use (2) [unmount of rpc_pipefs rpc_pipefs]
>> >>
>> >
>> > Hmm, the assumption we made about partial stability of d_subdirs
>> > under dir inode lock looks incorrect for rpc_pipefs.
>> > None of the functions that update the rpc_pipefs dcache take the parent
>> > inode lock.
>>
>> That may be, but I'm confused how that would trigger this issue. If I'm
>> understanding correctly, this warning indicates a reference counting
>> bug.
>
> Yes.
> On generic_shutdown_super() there should be no more
> references to dentries.
>
>>
>> If __fsnotify_update_child_dentry_flags() had gone to sleep and the list
>> were edited, then it seems like there could be only two possibilities
>> that could cause bugs:
>>
>> 1. The dentry we slept holding a reference to was removed from the list,
>> and maybe moved to a different one, or just removed. If that were the
>> case, we're quite unlucky, because we'll start looping indefinitely as
>> we'll never get back to the beginning of the list, or worse.
>>
>> 2. A dentry adjacent to the one we held a reference to was removed. In
>> that case, our dentry's d_child pointers should get rearranged, and when
>> we wake, we should see those updates and continue.
>>
>> In neither of those cases do I understand where we could have done a
>> dget() unpaired with a dput(), which is what seemingly would trigger
>> this issue.
>>
>
> I got the same impression.

Well I feel stupid. The reason behind this seems to be... that
d_find_any_alias() returns a reference to the dentry, and I promptly
leaked that. I'll have it fixed in v3, which I'm testing now.
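
For anyone following along, the bug pattern can be sketched in user space
roughly as below. This is a toy model under stated assumptions, not the
patch itself: struct toy_dentry, toy_find_any_alias(), toy_dput() and the
two update_flags_*() helpers are hypothetical names that only mimic the
"lookup returns a reference, caller must drop it" contract of
d_find_any_alias() and dput().

```c
#include <assert.h>

/* Toy stand-in for a dentry: only the reference count matters here. */
struct toy_dentry {
	int refcount;
};

/* Hypothetical model of d_find_any_alias(): returns the dentry with one
 * extra reference that the caller now owns. */
static struct toy_dentry *toy_find_any_alias(struct toy_dentry *d)
{
	d->refcount++;
	return d;
}

/* Hypothetical model of dput(): drops one reference. */
static void toy_dput(struct toy_dentry *d)
{
	assert(d->refcount > 0);
	d->refcount--;
}

/* Buggy pattern: the reference returned by the lookup is never dropped,
 * so the count stays elevated after the function returns. */
static void update_flags_leaky(struct toy_dentry *d)
{
	struct toy_dentry *alias = toy_find_any_alias(d);

	(void)alias; /* ... use alias ..., then forget to put it */
}

/* Fixed pattern: every reference taken by the lookup is paired with a put. */
static void update_flags_fixed(struct toy_dentry *d)
{
	struct toy_dentry *alias = toy_find_any_alias(d);

	/* ... use alias ... */
	toy_dput(alias);
}
```

Calling update_flags_leaky() on a toy dentry leaves its count one higher
than before, while update_flags_fixed() leaves it unchanged. A leftover
count like that is the kind of thing the "still in use" check at unmount
complains about.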

Stephen