Re: [RFC PATCH -v4 00/14] fsnotify, dnotify, and inotify

From: C. Scott Ananian
Date: Mon Dec 29 2008 - 13:20:00 EST


On Mon, Dec 22, 2008 at 3:53 PM, Eric Paris <eparis@xxxxxxxxxx> wrote:
> fanotify doesn't have directory notifications. You get notifications
> about the individual inodes. I don't have mv tracking, and I'm not sure
> how much trouble it'll be. Maybe not that bad since a mv doesn't have
> to deliver the fd to userspace. I promise to think about how to make
> this better for you. In any case, you can get pathnames, dev, and inode
> very quickly.

For consistency, directory events could appear as mutations
(write/close_was_writable) of the directory inodes, with an open file
descriptor to the directory.

It would be a bit annoying, but you wouldn't even have to pass the
'from' and 'to' names. You can then readdir the open file descriptor
to the directory find out what's changed for yourself. (If you do
pass 'from' and 'to' path information, then the open file descriptor
to the directory is unnecessary.)

Capturing a coherent directory listing on startup (or while processing
a directory mutation) is the other tricky bit; being able to
block-for-ack on a rename would make that much easier, returning
SEND_DELAY for example. The basic ideas seem coherent with what
you've got in fanotify. A write_need_access_decision on directories
would allow userspace locking during the scan, and you'd set a
fastpath (with SURVIVE_WRITE cleared) after the scan was complete.

The write_need_access_decision method would, as testiment to the
generality of the mechanism, allow userspace implementations of
copy-on-write and versioning file systems. A versioning filesystem
could check in the 'new version' of a file on close_was_writable;
write_need_access would just need to block until userland had
completed the checkin to prevent races between kernel and userland.
For a copy-on-write system based on hardlinks,
write_need_access_decision would break a hardlink if necessary, and
regardless install a fast path skipping future checks.

> In any case, event coallessing seems like it needs to work by walking
> the notification queue starting at the back and working forwards. If
> you find a duplicate just drop. If you find a mv, place this one at the
> end. Kinda sucks that we are taking and O(1) operation and making it
> O(n).
>
> At least with fanotify you don't have mv races, since you have an open
> fd which still gives you the access you need even if it mv'd.

On reflection event coalescing is not necessary so long as you're
passing open file descriptors around. The kernel mechanisms are
sufficient to ensure that the file descriptors are still valid even if
unlink/move operations occur after the mutation.

To briefly summarize my fanotify wishlist:
1) write/close_was_writable events on directory inodes triggered by
create/rename/unlink.
2) write_need_access_decision event applicable to both regular files
and directories.
--scott

--
( http://cscott.net/ )
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/