Re: [RFC 1/2] fanotify: new event FAN_MODIFY_DIR

From: Amir Goldstein
Date: Tue Mar 14 2017 - 09:55:45 EST


On Tue, Mar 14, 2017 at 2:41 PM, Filip Štědronský <r.lkml@xxxxxxxxxx> wrote:
> Hi,
>
> On Tue, Mar 14, 2017 at 12:11:40PM +0200, Amir Goldstein wrote:
>> > - file system indexers / desktop search tools
>> > - file synchronization tools (like Dropbox, Nextcloud, etc.),
>> > online backup tools
>>
>> This last one is the use case of my employer, Ctera Networks.
>> Out of curiosity, what is the use case that you are focusing on?
>
> I'm working on a file synchronization tool as part of my bachelor
> thesis at Charles University.
>
> When I started (now over a year ago ... long story), there were AFAIK no
> attempts at solving the recursive inotify issue, only a lot of
> complaints, so I cobbled up together something simple (I'm not a kernel
> developer by trade, this was my first patch) that would allow me to
> work on the userspace parts, which are the main bulk.
>

That sounds very useful. I was actually looking for an open userspace
implementation where I could demonstrate my patches, but all the projects
I looked at (beagle etc.) seem to have died from desperation waiting
for inotify scalability to be fixed...

> I try to focus on algorithmic and implementation efficiency as opposed
> to fancy GUIs and similar. I want it to be fast with 100k directories
> and 10M files in home dir. But it's very WIP so we'll see how that turns
> out.
>

I did some basic tests of my super block watch with 1M directories,
and currently the code is under integration testing in our company.

I prepared a drop-in replacement for our fs change listener class
that was previously implemented with inotify.
Currently, the userspace code still "thinks" it is adding watches on directories
recursively, but what it really does is just get all their file handles.
Then, the code that maps an event to "watch descriptor" actually matches
the fhandle in the event to the directory fhandle (much like in your example).
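
A minimal sketch of that matching step, not the actual listener code. It
assumes the directory handles were obtained earlier (for example with
name_to_handle_at(2)) and copied into a table. The names dir_handle, watch,
dir_handle_make and find_wd are hypothetical and exist only for illustration:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define HANDLE_MAX 128 /* assumed upper bound, same value as MAX_HANDLE_SZ */

/* Hypothetical copy of the relevant fields of a file handle. */
struct dir_handle {
    int type;                       /* handle type reported by the kernel */
    unsigned int len;               /* number of valid bytes in data[] */
    unsigned char data[HANDLE_MAX]; /* opaque handle bytes */
};

/* One fake "watch": the descriptor the old inotify-style code expects. */
struct watch {
    int wd;
    struct dir_handle handle;
};

/* Build a handle from raw bytes (stand-in for a real lookup). */
struct dir_handle dir_handle_make(int type, const void *bytes, unsigned int len)
{
    struct dir_handle h;

    assert(len <= HANDLE_MAX);
    memset(&h, 0, sizeof(h));
    h.type = type;
    h.len = len;
    memcpy(h.data, bytes, len);
    return h;
}

/* Two handles match when type, length and opaque bytes are all equal. */
int handle_equal(const struct dir_handle *a, const struct dir_handle *b)
{
    return a->type == b->type && a->len == b->len &&
           memcmp(a->data, b->data, a->len) == 0;
}

/* Map the handle carried by an event to a watch descriptor, or -1. */
int find_wd(const struct watch *table, size_t n, const struct dir_handle *ev)
{
    assert(table != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (handle_equal(&table[i].handle, ev))
            return table[i].wd;
    return -1;
}
```

The linear scan is only there to keep the sketch short; with around 1M
directories a hash table keyed on the handle bytes would be the obvious choice.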

If you are not married to using your kernel patch, you can just go ahead and
try my patch set.
You can fetch it directly from:
https://github.com/amir73il/linux/tree/fanotify_sb

It has been recently rebased over v4.11-rc1.

There are also ready ports for stable kernels v4.1, v4.4 and v4.9
in my github if that serves you better.

Marko Rauhamaa [CCed] has also tested my patches to his satisfaction.


>> I had the feeling that all recursive inotify users are hiding in the shadows,
>> but was missing more concrete evidence.
>
> Yes, even Dropbox uses inotify. They have articles in their help on how
> to increase inotify.max_user_watches: https://www.dropbox.com/help/145.
> That's not good PR. ;-) (I'm not affiliated with DB, just pointing that
> out.)
>
>> About the argument of not having to change in-kernel framework,
>> I don't think it should be a consideration at all.
>
> Understood. I tried to stay conservative and non-controversial because I
> imagined that radical framework changes would take months of discussions
> (look at the history of statx) and this issue seems to be pressing for
> quite a lot of people. But rushing is of course not the best strategy
> either, there are always compromises.
>

Don't get me wrong, I certainly do *understand* why you would want to stay
away from making radical framework changes.
But I believe it is in the best interest of the community to choose the
best solution going forward while eliminating the fear-of-change factor.
This is why I say that fear of change SHOULD not be a consideration.

>> If you don't specify FAN_EVENT_INFO_NAME, you can get filename events
>> FAN_MOVE|FAN_CREATE|FAN_DELETE without the name.
>>
>> What you do get is the file descriptor of the parent. sounds familiar? ;-)
>
> I didn't notice this bit. That sounds like a win-win.
>

Well, I may be giving you half the truth here.
With the current patches, not specifying FAN_EVENT_INFO_NAME
will mask out the filename events from the mark, but
1. The implementation could be easily changed
2. You can set FAN_EVENT_INFO_NAME and ignore the
filename info in userspace. You still get the parent fd
as your program expects and you get a bonus fhandle
of the parent for free with the event, see my test program:

github.com/amir73il/fsnotify-utils/blob/master/src/test/fanotify_demo.c#L101

Please let me know if that is sufficient for your needs
or if you need me to prepare a version that delivers filename events
without filename info, which would allow filename events to be merged.

Sounds to me like if you get an event FAN_DELETE, "aaa",
your implementation can update the db directly without having
to scan the directory, so it should be useful.
For your consideration.
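
To illustrate the point, here is a rough sketch of applying a named delete
event straight to an in-memory index instead of rescanning the directory.
Everything in it (name_index, index_add, index_delete, the fixed-size table,
the integer directory id) is a hypothetical stand-in for whatever db your
tool really uses:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define ENTRY_NAME_MAX 255 /* assumed maximum file name length */
#define INDEX_CAP 64       /* tiny fixed capacity, for the sketch only */

/* One known directory entry: which directory it lives in and its name. */
struct index_entry {
    int dir_id;
    char name[ENTRY_NAME_MAX + 1];
};

struct name_index {
    struct index_entry e[INDEX_CAP];
    size_t n;
};

/* Record that directory dir_id contains name. Returns 0, or -1 if full. */
int index_add(struct name_index *ix, int dir_id, const char *name)
{
    if (ix->n == INDEX_CAP || strlen(name) > ENTRY_NAME_MAX)
        return -1;
    ix->e[ix->n].dir_id = dir_id;
    strcpy(ix->e[ix->n].name, name);
    ix->n++;
    return 0;
}

/*
 * Handle a delete event that carries the parent directory and the name:
 * drop exactly that entry. Returns 1 if it was found, 0 otherwise.
 */
int index_delete(struct name_index *ix, int dir_id, const char *name)
{
    for (size_t i = 0; i < ix->n; i++) {
        if (ix->e[i].dir_id == dir_id && strcmp(ix->e[i].name, name) == 0) {
            ix->e[i] = ix->e[ix->n - 1]; /* fill the hole with the last entry */
            ix->n--;
            return 1;
        }
    }
    return 0;
}
```

Without the name in the event, the only option would be to list the whole
directory again and diff it against the index.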

Amir.