fanotify as syscalls

From: Eric Paris
Date: Mon Sep 14 2009 - 15:08:43 EST


Long ago I implemented fanotify as basically a /dev interface using
ioctls(). Alan suggested I use a socket protocol, which would let me
use get/setsockopt(); that is still not great, but it is light years
better than ioctl. The fanotify interface as I want to push it to
Linus, and as I've been requesting comments on for the last 1.25
years, is just that. It really makes no use of the networking system
other than bind() and setsockopt() and everyone tends to agree the
things I want to do can't reasonably be done using network hooks and a
'real' socket protocol. I like this interface: setsockopt() makes it so
easy to add new functionality as we flesh out other users. fanotify as
it stands today has a number of groups who will port to it, has a number
of advantages over inotify, and, I have been told privately, meets the
needs of the original group of people who paid me to work on it (two
very large anti-malware companies who currently unprotect and hack the
syscall table of their users).

Just this week I got another request to look at syscalls, so I did. I
haven't prototyped it, but I can do it with syscalls. They would look
like this:

int fanotify_init(int flags, int f_flags, __u64 mask, unsigned int priority);

int fanotify_add_mark(int fanotify_fd, char *path, __u64 mask, __u64 ignored_mask);
int fanotify_add_mark_fd(int fanotify_fd, int fd, __u64 mask, __u64 ignored_mask);
int fanotify_rm_mark(int fanotify_fd, char *path, __u64 mask);
int fanotify_rm_mark_fd(int fanotify_fd, int fd, __u64 mask);

The four calls above could probably be squashed into two syscalls with
an extra flags field.
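
As a rough illustration of that squashing, the four add/rm mark calls
could fold into two entry points if a flags argument says whether the
target is a path or an fd. This is only a userspace sketch: the
"_sketch" function names, the FAN_MARK_BY_FD flag, and the stub bodies
are assumptions made up for illustration, not part of the proposal or
of any kernel API.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag (not in the proposal): target is 'fd', not 'path'. */
#define FAN_MARK_BY_FD 0x1u

/* Validate the target the way a combined entry point might.
 * Returns 0 on success or a negative errno value. */
static int check_target(unsigned int flags, int fd, const char *path)
{
    if (flags & ~FAN_MARK_BY_FD)
        return -EINVAL;                 /* unknown flag bits */
    if (flags & FAN_MARK_BY_FD)
        return fd >= 0 ? 0 : -EBADF;    /* fd variant: path is ignored */
    return path != NULL ? 0 : -EFAULT;  /* path variant: fd is ignored */
}

/* Stand-in for fanotify_add_mark() plus fanotify_add_mark_fd(). */
int fanotify_add_mark_sketch(int fanotify_fd, unsigned int flags, int fd,
                             const char *path, uint64_t mask,
                             uint64_t ignored_mask)
{
    assert(fanotify_fd >= 0);           /* sketch precondition only */
    (void)mask;
    (void)ignored_mask;                 /* stub: no real marking happens */
    return check_target(flags, fd, path);
}

/* Stand-in for fanotify_rm_mark() plus fanotify_rm_mark_fd(). */
int fanotify_rm_mark_sketch(int fanotify_fd, unsigned int flags, int fd,
                            const char *path, uint64_t mask)
{
    assert(fanotify_fd >= 0);           /* sketch precondition only */
    (void)mask;                         /* stub: no real unmarking happens */
    return check_target(flags, fd, path);
}
```

With flags == 0 the caller passes a path and the fd argument is
ignored; with FAN_MARK_BY_FD set it is the other way around. The cost
is one argument that is always unused on any given call.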

int fanotify_clear_marks(int fanotify_fd);

int fanotify_perm_response(int fanotify_fd, __u64 cookie, int response);

int fanotify_ignore_sb(int fanotify_fd, long f_type);
int fanotify_ignore_fsid(int fanotify_fd, fsid_t f_fsid);

These two are the most questionable. They would honestly only be used
by things that want system-wide notification, and I can't imagine that
being many things other than AV vendors. But they really need a way to
exclude notification when people open/close/read/write to /proc (which
is the point of ignore_sb).
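
For context, the f_type value such a call would take is the filesystem
magic number that userspace can already read with statfs(2). The
sketch below looks it up and checks it against a small ignore list.
statfs() and PROC_SUPER_MAGIC (from <linux/magic.h>) are existing
Linux interfaces; the helper names fs_type_of() and type_is_ignored()
are hypothetical, and the list is only a userspace stand-in for
whatever the kernel side would keep.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/vfs.h>      /* statfs(2), Linux-specific header */
#include <linux/magic.h>  /* PROC_SUPER_MAGIC */

/* Return the filesystem type magic for 'path', or -1 if statfs() fails. */
long fs_type_of(const char *path)
{
    struct statfs st;

    if (statfs(path, &st) != 0)
        return -1;
    return (long)st.f_type;
}

/* Hypothetical userspace mirror of an ignore list keyed by f_type.
 * Returns 1 if 'f_type' is in the list, 0 otherwise. */
int type_is_ignored(long f_type, const long *ignored, size_t n)
{
    assert(ignored != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (ignored[i] == f_type)
            return 1;
    return 0;
}
```

An AV-style listener would presumably compute fs_type_of("/proc") once
and hand the result to the ignore call, so that events from procfs
never reach it.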

Since I don't have a solution to subtree notification I don't know if it
will work in this syscall framework. I know people want subtree
notification and I'm willing to take a stab at it after the fscking all
notification is accepted. That's one of the main reasons I like
setsockopt over tons of syscalls. I can add a new one very easily. I
also can easily expand arguments by just creating a new sockopt. No
userspace headaches.
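
The extensibility argument can be pictured with a userspace mock of an
option dispatcher: a new capability is one more option number behind
the same entry point, and an argument list grows by introducing a new
option with a bigger struct while the old one keeps working.
Everything named here (the option numbers, the struct layouts,
mock_setsockopt()) is invented for illustration and does not describe
the actual fanotify socket options.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical option numbers and argument structs. */
enum { MOCK_ADD_MARK = 1, MOCK_ADD_MARK_V2 = 2 };

struct mock_mark    { int fd; uint64_t mask; };
struct mock_mark_v2 { int fd; uint64_t mask; uint64_t ignored_mask; };

/* Last decoded request, kept so a caller can inspect the result. */
struct mock_state { int fd; uint64_t mask; uint64_t ignored_mask; };

/* Single entry point; returns 0 or a negative errno value. */
int mock_setsockopt(struct mock_state *st, int optname,
                    const void *optval, size_t optlen)
{
    assert(st != NULL);
    if (optval == NULL)
        return -EFAULT;

    switch (optname) {
    case MOCK_ADD_MARK: {
        struct mock_mark m;

        if (optlen != sizeof(m))
            return -EINVAL;
        memcpy(&m, optval, sizeof(m));
        st->fd = m.fd;
        st->mask = m.mask;
        st->ignored_mask = 0;   /* old option has no ignored mask */
        return 0;
    }
    case MOCK_ADD_MARK_V2: {    /* added later; old callers unaffected */
        struct mock_mark_v2 m;

        if (optlen != sizeof(m))
            return -EINVAL;
        memcpy(&m, optval, sizeof(m));
        st->fd = m.fd;
        st->mask = m.mask;
        st->ignored_mask = m.ignored_mask;
        return 0;
    }
    default:
        return -ENOPROTOOPT;    /* unknown option number */
    }
}
```

Binaries built against the first option keep working unchanged, which
is the "no userspace headaches" property; a fixed syscall signature
would need a new syscall to gain the extra argument.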

Are there demands that I convert to syscalls? Do I really gain anything
using 9 new inextensible syscalls over socket(), bind(), and 8
setsockopt() calls?

I'd like to send these patches along, so a ruling from on high would be
great....

-Eric
