Re: fanotify man pages for review
From: Michael Kerrisk (man-pages)
Date: Wed May 14 2014 - 06:37:26 EST
Eric,
Another ping for this request. Could you please take a look at the
fanotify man pages that Heinrich has written?
Cheers,
Michael
On Wed, May 7, 2014 at 9:15 PM, Michael Kerrisk (man-pages)
<mtk.manpages@xxxxxxxxx> wrote:
> Hello Eric,
>
> Ping for this request. Could you please take a look at these pages?
>
> Thanks,
>
> Michael
>
>
>
> On Tue, Apr 29, 2014 at 3:51 PM, Michael Kerrisk (man-pages)
> <mtk.manpages@xxxxxxxxx> wrote:
>> Hello Eric (and all),
>>
>> Heinrich Schuchardt has made a magnificent effort writing some man
>> pages that extensively document the fanotify API that you added in
>> Linux 2.6.36/37. Could I ask you (and anyone else who is interested)
>> to review them please for completeness and accuracy. I would
>> really like to get such a review before publishing the pages, in
>> order to minimize the chance of publishing errors.
>>
>> The pages are:
>>
>> fanotify.7:
>> An overview of the fanotify API complete with an
>> example program
>>
>> fanotify_init.2
>> Description of the fanotify_init() system call
>>
>> fanotify_mark.2
>> Description of the fanotify_mark() system call
>>
>> Cheers,
>>
>> Michael
>>
>>
>>
>> diff --git a/man2/fanotify_init.2 b/man2/fanotify_init.2
>> new file mode 100644
>> index 0000000..e54fe7e
>> --- /dev/null
>> +++ b/man2/fanotify_init.2
>> @@ -0,0 +1,206 @@
>> +.\" Copyright (C) 2013, Heinrich Schuchardt <xypron.glpk@xxxxxx>
>> +.\"
>> +.\" %%%LICENSE_START(VERBATIM)
>> +.\" Permission is granted to make and distribute verbatim copies of this
>> +.\" manual provided the copyright notice and this permission notice are
>> +.\" preserved on all copies.
>> +.\"
>> +.\" Permission is granted to copy and distribute modified versions of
>> +.\" this manual under the conditions for verbatim copying, provided that
>> +.\" the entire resulting derived work is distributed under the terms of
>> +.\" a permission notice identical to this one.
>> +.\"
>> +.\" Since the Linux kernel and libraries are constantly changing, this
>> +.\" manual page may be incorrect or out-of-date. The author(s) assume
>> +.\" no responsibility for errors or omissions, or for damages resulting
>> +.\" from the use of the information contained herein. The author(s) may
>> +.\" not have taken the same level of care in the production of this
>> +.\" manual, which is licensed free of charge, as they might when working
>> +.\" professionally.
>> +.\"
>> +.\" Formatted or processed versions of this manual, if unaccompanied by
>> +.\" the source, must acknowledge the copyright and authors of this work.
>> +.\" %%%LICENSE_END
>> +.TH FANOTIFY_INIT 2 2014-04-24 "Linux" "Linux Programmer's Manual"
>> +.SH NAME
>> +fanotify_init \- create and initialize fanotify group
>> +.SH SYNOPSIS
>> +.B #include <fcntl.h>
>> +.br
>> +.B #include <sys/fanotify.h>
>> +.sp
>> +.BI "int fanotify_init(unsigned int " flags ", unsigned int " event_f_flags );
>> +.SH DESCRIPTION
>> +For an overview of the fanotify API, see
>> +.BR fanotify (7).
>> +.PP
>> +.BR fanotify_init ()
>> +initializes a new fanotify group and returns a file descriptor for the event
>> +queue associated with the group.
>> +.PP
>> +The file descriptor is used in calls to
>> +.BR fanotify_mark (2)
>> +to specify the files, directories, and mounts for which fanotify events shall
>> +be created.
>> +These events are received by reading from the file descriptor.
>> +Some events are only informative, indicating that a file has been accessed.
>> +Other events can be used to determine whether
>> +another application is permitted to access a file or directory.
>> +Permission to access filesystem objects is granted by writing to the file
>> +descriptor.
>> +.PP
>> +Multiple programs may be using the fanotify interface at the same time to
>> +monitor the same files.
>> +.PP
>> +In the current implementation, the number of fanotify groups per user is
>> +limited to 128.
>> +This limit cannot be overridden.
>> +.PP
>> +Calling
>> +.BR fanotify_init ()
>> +requires the
>> +.B CAP_SYS_ADMIN
>> +capability.
>> +This constraint might be relaxed in future versions of the API.
>> +Therefore, certain additional capability checks have been implemented as
>> +indicated below.
>> +.PP
>> +The
>> +.I flags
>> +argument contains a multi-bit field defining the notification class of the
>> +listening application and further single bit fields specifying the behavior of
>> +the file descriptor.
>> +.PP
>> +If multiple listeners for permission events exist, the notification class is
>> +used to establish the sequence in which the listeners receive the events.
>> +.PP
>> +Only one of the following notification classes may be specified in
>> +.IR flags :
>> +.TP
>> +.B FAN_CLASS_PRE_CONTENT
>> +This value allows the receipt of events notifying that a file has been
>> +accessed and events requesting a decision on whether a file may be accessed.
>> +It is intended for event listeners that need to access files before they
>> +contain their final data.
>> +This notification class might be used by hierarchical storage managers, for
>> +example.
>> +.TP
>> +.B FAN_CLASS_CONTENT
>> +This value allows the receipt of events notifying that a file has been
>> +accessed and events requesting a decision on whether a file may be accessed.
>> +It is intended for event listeners that need to access files when they already
>> +contain their final content.
>> +This notification class might be used by malware detection programs, for
>> +example.
>> +.TP
>> +.B FAN_CLASS_NOTIF
>> +This is the default value.
>> +It does not need to be specified.
>> +This value only allows the receipt of events notifying that a file has been
>> +accessed.
>> +Permission decisions before the file is accessed are not possible.
>> +.PP
>> +Listeners with different notification classes will receive events in the
>> +order
>> +.BR FAN_CLASS_PRE_CONTENT ,
>> +.BR FAN_CLASS_CONTENT ,
>> +.BR FAN_CLASS_NOTIF .
>> +The order of notification for listeners in the same class is undefined.
>> +.PP
>> +The following bit mask values can be set additionally in
>> +.IR flags :
>> +.TP
>> +.B FAN_CLOEXEC
>> +This flag sets the close-on-exec flag
>> +.RB ( FD_CLOEXEC )
>> +on the new file descriptor.
>> +See the description of the
>> +.B O_CLOEXEC
>> +flag in
>> +.BR open (2).
>> +.TP
>> +.B FAN_NONBLOCK
>> +This flag enables the nonblocking flag
>> +.RB ( O_NONBLOCK )
>> +for the file descriptor.
>> +Reading from the file descriptor will not block.
>> +Instead, if no data is available,
>> +.BR read (2)
>> +will fail with the error
>> +.BR EAGAIN .
>> +.TP
>> +.B FAN_UNLIMITED_QUEUE
>> +This flag removes the limit of 16384 events for the event queue.
>> +It requires the
>> +.B CAP_SYS_ADMIN
>> +capability.
>> +.TP
>> +.B FAN_UNLIMITED_MARKS
>> +This flag removes the limit of 8192 marks.
>> +It requires the
>> +.B CAP_SYS_ADMIN
>> +capability.
>> +.PP
>> +The argument
>> +.I event_f_flags
>> +defines the file flags with which file descriptors for fanotify events shall
>> +be created.
>> +For explanations of possible values, see the argument
>> +.I flags
>> +of the
>> +.BR open (2)
>> +system call.
>> +Useful values are:
>> +.TP
>> +.B O_RDONLY
>> +This value allows only read access.
>> +.TP
>> +.B O_WRONLY
>> +This value allows only write access.
>> +.TP
>> +.B O_RDWR
>> +This value allows read and write access.
>> +.TP
>> +.B O_CLOEXEC
>> +This flag enables the close-on-exec flag for the file descriptor.
>> +.TP
>> +.B O_LARGEFILE
>> +This flag enables support for files exceeding 2 GB.
>> +Failing to set this flag will result in an
>> +.B EOVERFLOW
>> +error when trying to open a large file which is monitored by an fanotify group
>> +on a 32-bit system.
>> +.SH RETURN VALUE
>> +On success,
>> +.BR fanotify_init ()
>> +returns a new file descriptor.
>> +On error, \-1 is returned, and
>> +.I errno
>> +is set to indicate the error.
>> +.SH ERRORS
>> +.TP
>> +.B EINVAL
>> +An invalid value was passed in
>> +.IR flags .
>> +.B FAN_ALL_INIT_FLAGS
>> +defines all allowable bits.
>> +.TP
>> +.B EMFILE
>> +The number of fanotify groups of the user exceeds 128.
>> +.TP
>> +.B ENOMEM
>> +The allocation of memory for the notification group failed.
>> +.TP
>> +.B EPERM
>> +The operation is not permitted because the caller lacks the
>> +.B CAP_SYS_ADMIN
>> +capability.
>> +.SH VERSIONS
>> +.BR fanotify_init ()
>> +was introduced in version 2.6.36 of the Linux kernel and enabled in version
>> +2.6.37.
>> +.SH "CONFORMING TO"
>> +This system call is Linux-specific.
>> +.SH "SEE ALSO"
>> +.BR fanotify_mark (2),
>> +.BR fanotify (7)
>> diff --git a/man2/fanotify_mark.2 b/man2/fanotify_mark.2
>> new file mode 100644
>> index 0000000..693eff8
>> --- /dev/null
>> +++ b/man2/fanotify_mark.2
>> @@ -0,0 +1,327 @@
>> +.\" Copyright (C) 2013, Heinrich Schuchardt <xypron.glpk@xxxxxx>
>> +.\"
>> +.\" %%%LICENSE_START(VERBATIM)
>> +.\" Permission is granted to make and distribute verbatim copies of this
>> +.\" manual provided the copyright notice and this permission notice are
>> +.\" preserved on all copies.
>> +.\"
>> +.\" Permission is granted to copy and distribute modified versions of
>> +.\" this manual under the conditions for verbatim copying, provided that
>> +.\" the entire resulting derived work is distributed under the terms of
>> +.\" a permission notice identical to this one.
>> +.\"
>> +.\" Since the Linux kernel and libraries are constantly changing, this
>> +.\" manual page may be incorrect or out-of-date. The author(s) assume
>> +.\" no responsibility for errors or omissions, or for damages resulting
>> +.\" from the use of the information contained herein. The author(s) may
>> +.\" not have taken the same level of care in the production of this
>> +.\" manual, which is licensed free of charge, as they might when working
>> +.\" professionally.
>> +.\"
>> +.\" Formatted or processed versions of this manual, if unaccompanied by
>> +.\" the source, must acknowledge the copyright and authors of this work.
>> +.\" %%%LICENSE_END
>> +.TH FANOTIFY_MARK 2 2014-04-24 "Linux" "Linux Programmer's Manual"
>> +.SH NAME
>> +fanotify_mark \- add, remove, or modify an fanotify mark on a filesystem
>> +object
>> +.SH SYNOPSIS
>> +.nf
>> +.B #include <sys/fanotify.h>
>> +.sp
>> +.BI "int fanotify_mark(int " fanotify_fd ", unsigned int " flags ,
>> +.BI " uint64_t " mask ", int " dirfd \
>> +", const char *" pathname );
>> +.fi
>> +.SH DESCRIPTION
>> +For an overview of the fanotify API, see
>> +.BR fanotify (7).
>> +.PP
>> +.BR fanotify_mark ()
>> +adds, removes, or modifies an fanotify mark on a filesystem object.
>> +The caller must have read permission on the filesystem object that is to be
>> +marked.
>> +.PP
>> +The
>> +.I fanotify_fd
>> +argument is a file descriptor returned by
>> +.BR fanotify_init (2).
>> +.PP
>> +.I flags
>> +is a bit mask describing the modification to perform.
>> +It must include exactly one of the following values:
>> +.TP
>> +.B FAN_MARK_ADD
>> +The events in
>> +.I mask
>> +will be added to the mark mask (or to the ignore mask).
>> +.I mask
>> +must be nonempty or the error
>> +.B EINVAL
>> +will occur.
>> +.TP
>> +.B FAN_MARK_REMOVE
>> +The events in argument
>> +.I mask
>> +will be removed from the mark mask (or from the ignore mask).
>> +.I mask
>> +must be nonempty or the error
>> +.B EINVAL
>> +will occur.
>> +.TP
>> +.B FAN_MARK_FLUSH
>> +Remove either all mount or non-mount marks from the fanotify group.
>> +If
>> +.I flags
>> +contains
>> +.BR FAN_MARK_MOUNT ,
>> +all marks for mounts are removed from the group.
>> +Otherwise, all marks for directories and files are removed.
>> +No flag other than
>> +.B FAN_MARK_MOUNT
>> +can be used in conjunction with
>> +.BR FAN_MARK_FLUSH .
>> +.I mask
>> +is ignored.
>> +.PP
>> +If none of the values above is specified, or more than one is specified, the
>> +call fails with the error
>> +.BR EINVAL .
>> +.PP
>> +In addition,
>> +.I flags
>> +may contain zero or more of the following:
>> +.TP
>> +.B FAN_MARK_DONT_FOLLOW
>> +If
>> +.I pathname
>> +is a symbolic link, mark the link itself, rather than the file to which it
>> +refers.
>> +(By default,
>> +.BR fanotify_mark ()
>> +dereferences
>> +.I pathname
>> +if it is a symbolic link.)
>> +.TP
>> +.B FAN_MARK_ONLYDIR
>> +If the filesystem object to be marked is not a directory, the error
>> +.B ENOTDIR
>> +shall be raised.
>> +.TP
>> +.B FAN_MARK_MOUNT
>> +Mark the mount point specified by
>> +.IR pathname .
>> +If
>> +.I pathname
>> +is not itself a mount point, the mount point containing
>> +.I pathname
>> +will be marked.
>> +All directories, subdirectories, and the contained files of the mount point
>> +will be monitored.
>> +.TP
>> +.B FAN_MARK_IGNORED_MASK
>> +The events in
>> +.I mask
>> +shall be added to or removed from the ignore mask.
>> +.TP
>> +.B FAN_MARK_IGNORED_SURV_MODIFY
>> +The ignore mask shall survive modify events.
>> +If this flag is not set, the ignore mask is cleared when a modify event occurs
>> +for the ignored file or directory.
>> +.PP
>> +.I mask
>> +defines which events shall be listened to (or which shall be ignored).
>> +It is a bit mask composed of the following values:
>> +.TP
>> +.B FAN_ACCESS
>> +Create an event when a file or directory (but see BUGS) is accessed (read).
>> +.TP
>> +.B FAN_MODIFY
>> +Create an event when a file is modified (write).
>> +.TP
>> +.B FAN_CLOSE_WRITE
>> +Create an event when a writable file is closed.
>> +.TP
>> +.B FAN_CLOSE_NOWRITE
>> +Create an event when a read-only file or directory is closed.
>> +.TP
>> +.B FAN_OPEN
>> +Create an event when a file or directory is opened.
>> +.TP
>> +.B FAN_OPEN_PERM
>> +Create an event when a permission to open a file or directory is requested.
>> +An fanotify file descriptor created with
>> +.B FAN_CLASS_PRE_CONTENT
>> +or
>> +.B FAN_CLASS_CONTENT
>> +is required.
>> +.TP
>> +.B FAN_ACCESS_PERM
>> +Create an event when a permission to read a file or directory is requested.
>> +An fanotify file descriptor created with
>> +.B FAN_CLASS_PRE_CONTENT
>> +or
>> +.B FAN_CLASS_CONTENT
>> +is required.
>> +.TP
>> +.B FAN_ONDIR
>> +Events for directories shall be created, for example when
>> +.BR opendir (3),
>> +.BR readdir (3)
>> +(but see BUGS), and
>> +.BR closedir (3)
>> +are called.
>> +Without this flag, only events for files are created.
>> +.TP
>> +.B FAN_EVENT_ON_CHILD
>> +Events for the immediate children of marked directories shall be created.
>> +The flag has no effect when marking mounts.
>> +Note that events are not generated for children of the subdirectories
>> +of marked directories.
>> +To monitor complete directory trees it is necessary to mark the relevant
>> +mount.
>> +.PP
>> +The following composed value is defined:
>> +.TP
>> +.B FAN_CLOSE
>> +A file is closed
>> +.RB ( FAN_CLOSE_WRITE | FAN_CLOSE_NOWRITE ).
>> +.PP
>> +The filesystem object to be marked is determined by the file descriptor
>> +.I dirfd
>> +and the pathname specified in
>> +.IR pathname :
>> +.IP * 3
>> +If
>> +.I pathname
>> +is NULL,
>> +.I dirfd
>> +defines the filesystem object to be marked.
>> +.IP *
>> +If
>> +.I pathname
>> +is NULL, and
>> +.I dirfd
>> +takes the special value
>> +.BR AT_FDCWD ,
>> +the current working directory is to be marked.
>> +.IP *
>> +If
>> +.I pathname
>> +is absolute, it defines the filesystem object to be marked, and
>> +.I dirfd
>> +is ignored.
>> +.IP *
>> +If
>> +.I pathname
>> +is relative, and
>> +.I dirfd
>> +does not have the value
>> +.BR AT_FDCWD ,
>> +then the filesystem object to be marked is determined by interpreting
>> +.I pathname
>> +relative to the directory referred to by
>> +.IR dirfd .
>> +.IP *
>> +If
>> +.I pathname
>> +is relative, and
>> +.I dirfd
>> +has the value
>> +.BR AT_FDCWD ,
>> +then the filesystem object to be marked is determined by interpreting
>> +.I pathname
>> +relative to the current working directory.
>> +.SH RETURN VALUE
>> +On success,
>> +.BR fanotify_mark ()
>> +returns 0.
>> +On error, \-1 is returned, and
>> +.I errno
>> +is set to indicate the error.
>> +.SH ERRORS
>> +.TP
>> +.B EBADF
>> +An invalid file descriptor was passed in
>> +.IR fanotify_fd .
>> +.TP
>> +.B EINVAL
>> +An invalid value was passed in
>> +.IR flags
>> +or
>> +.IR mask ,
>> +or
>> +.I fanotify_fd
>> +was not an fanotify file descriptor.
>> +.TP
>> +.B EINVAL
>> +The fanotify file descriptor was opened with
>> +.B FAN_CLASS_NOTIF
>> +and mask contains a flag for permission events
>> +.RB ( FAN_OPEN_PERM
>> +or
>> +.BR FAN_ACCESS_PERM ).
>> +.TP
>> +.B ENOENT
>> +The filesystem object indicated by
>> +.IR dirfd
>> +and
>> +.IR pathname
>> +does not exist.
>> +This error also occurs when trying to remove a mark from an object which is not
>> +marked.
>> +.TP
>> +.B ENOMEM
>> +The necessary memory could not be allocated.
>> +.TP
>> +.B ENOSPC
>> +The number of marks exceeds the limit of 8192 and
>> +.B FAN_UNLIMITED_MARKS
>> +was not specified in the call to
>> +.BR fanotify_init (2).
>> +.TP
>> +.B ENOTDIR
>> +.I flags
>> +contains
>> +.BR FAN_MARK_ONLYDIR ,
>> +and
>> +.I dirfd
>> +and
>> +.I pathname
>> +do not specify a directory.
>> +.SH VERSIONS
>> +.BR fanotify_mark ()
>> +was introduced in version 2.6.36 of the Linux kernel and enabled in version
>> +2.6.37.
>> +.SH CONFORMING TO
>> +This system call is Linux-specific.
>> +.SH BUGS
>> +As of Linux 3.15,
>> +the following bugs exist:
>> +.IP * 3
>> +.\" FIXME: Patch is in next-20140424.
>> +If
>> +.I flags
>> +contains
>> +.BR FAN_MARK_FLUSH ,
>> +.I dirfd
>> +and
>> +.I pathname
>> +must indicate a filesystem object, even though this object is not used.
>> +.IP *
>> +.\" FIXME: Patch is in next-20140424.
>> +.BR readdir (2)
>> +does not result in a
>> +.B FAN_ACCESS
>> +event.
>> +.IP *
>> +.\" FIXME: Patch proposed.
>> +If
>> +.BR fanotify_mark (2)
>> +is called with
>> +.BR FAN_MARK_FLUSH ,
>> +.I flags
>> +is not checked for invalid values.
>> +.SH SEE ALSO
>> +.BR fanotify_init (2),
>> +.BR fanotify (7)
>> diff --git a/man7/fanotify.7 b/man7/fanotify.7
>> new file mode 100644
>> index 0000000..083244f
>> --- /dev/null
>> +++ b/man7/fanotify.7
>> @@ -0,0 +1,684 @@
>> +.\" Copyright (C) 2013, Heinrich Schuchardt <xypron.glpk@xxxxxx>
>> +.\"
>> +.\" %%%LICENSE_START(VERBATIM)
>> +.\" Permission is granted to make and distribute verbatim copies of this
>> +.\" manual provided the copyright notice and this permission notice are
>> +.\" preserved on all copies.
>> +.\"
>> +.\" Permission is granted to copy and distribute modified versions of
>> +.\" this manual under the conditions for verbatim copying, provided that
>> +.\" the entire resulting derived work is distributed under the terms of
>> +.\" a permission notice identical to this one.
>> +.\"
>> +.\" Since the Linux kernel and libraries are constantly changing, this
>> +.\" manual page may be incorrect or out-of-date. The author(s) assume
>> +.\" no responsibility for errors or omissions, or for damages resulting
>> +.\" from the use of the information contained herein. The author(s) may
>> +.\" not have taken the same level of care in the production of this
>> +.\" manual, which is licensed free of charge, as they might when working
>> +.\" professionally.
>> +.\"
>> +.\" Formatted or processed versions of this manual, if unaccompanied by
>> +.\" the source, must acknowledge the copyright and authors of this work.
>> +.\" %%%LICENSE_END
>> +.TH FANOTIFY 7 2014-04-24 "Linux" "Linux Programmer's Manual"
>> +.SH NAME
>> +fanotify \- monitoring filesystem events
>> +.SH DESCRIPTION
>> +The fanotify API provides notification and interception of filesystem events.
>> +Use cases include virus scanning and hierarchical storage management.
>> +Currently, only a limited set of events is supported.
>> +In particular, there is no support for create, delete, and move events.
>> +(See
>> +.BR inotify (7)
>> +for details of an API that does report those events.)
>> +
>> +Additional capabilities compared to the
>> +.BR inotify (7)
>> +API are monitoring of complete mounts, access permission decisions, and the
>> +possibility to read or modify files before access by other applications.
>> +
>> +The following system calls are used with this API:
>> +.BR fanotify_init (2),
>> +.BR fanotify_mark (2),
>> +.BR read (2),
>> +.BR write (2),
>> +and
>> +.BR close (2).
>> +.SS fanotify_init(), fanotify_mark(), and notification groups
>> +The
>> +.BR fanotify_init (2)
>> +system call creates and initializes an fanotify notification group
>> +and returns a file descriptor referring to it.
>> +.PP
>> +An fanotify notification group is a kernel-internal object that holds
>> +a list of files, directories, and mount points for which events shall be
>> +created.
>> +.PP
>> +For each entry in an fanotify notification group, two bit masks exist: the
>> +.I mark
>> +mask and the
>> +.I ignore
>> +mask.
>> +The mark mask defines file activities for which an event shall be created.
>> +The ignore mask defines activities for which no event shall be generated.
>> +Having these two types of masks permits a mount point or directory to be
>> +marked for receiving events, while at the same time ignoring events for
>> +specific objects under that mount point or directory.
>> +.PP
>> +The
>> +.BR fanotify_mark (2)
>> +system call adds a file, directory, or mount to a notification group
>> +and specifies which events
>> +shall be reported (or ignored), or removes or modifies such an entry.
>> +.PP
>> +A possible usage of the ignore mask is for a file cache.
>> +Events of interest for a file cache are modification of a file and closing
>> +of the same.
>> +Hence, the cached directory or mount point is to be marked to receive these
>> +events.
>> +After receiving the first event informing that a file has been modified, the
>> +corresponding cache entry will be invalidated.
>> +No further modification events for this file are of interest until the file is
>> +closed.
>> +Hence, the modify event can be added to the ignore mask.
>> +Upon receiving the closed event, the modify event can be removed from the
>> +ignore mask and the file cache entry can be updated.
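
A minimal sketch, not part of the patch, of the file-cache scenario just described. The names cache_action_for() and set_modify_ignored() are hypothetical. The sketch assumes the event's own file descriptor is passed as dirfd with a NULL pathname, and it sets FAN_MARK_IGNORED_SURV_MODIFY so that the ignore mask is not cleared by the next modify event.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/fanotify.h>

enum cache_action { CACHE_KEEP, CACHE_INVALIDATE, CACHE_REFRESH };

/* Decide what the (hypothetical) cache should do for one event.
   A close of a writable file takes precedence, since merged events may
   carry both FAN_MODIFY and FAN_CLOSE_WRITE in a single mask. */
enum cache_action
cache_action_for(const struct fanotify_event_metadata *ev)
{
    assert(ev != NULL);
    if (ev->mask & FAN_CLOSE_WRITE)
        return CACHE_REFRESH;
    if (ev->mask & FAN_MODIFY)
        return CACHE_INVALIDATE;
    return CACHE_KEEP;
}

/* Add FAN_MODIFY to (ignore != 0) or remove it from (ignore == 0) the
   ignore mask of the file referred to by 'event_fd'.
   Returns 0 on success, or -1 with errno set. */
int
set_modify_ignored(int fanotify_fd, int event_fd, int ignore)
{
    unsigned int flags = FAN_MARK_IGNORED_MASK;

    if (ignore)
        flags |= FAN_MARK_ADD | FAN_MARK_IGNORED_SURV_MODIFY;
    else
        flags |= FAN_MARK_REMOVE;
    return fanotify_mark(fanotify_fd, flags, FAN_MODIFY, event_fd, NULL);
}
```

On CACHE_INVALIDATE the listener would call set_modify_ignored(fd, ev->fd, 1); on CACHE_REFRESH it would call set_modify_ignored(fd, ev->fd, 0) and reload the entry.
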
>> +.PP
>> +The entries in the fanotify notification groups refer to files and directories
>> +via their inode number and to mounts via their mount ID.
>> +If files or directories are renamed or moved, the respective entries survive.
>> +If files or directories are deleted or mounts are unmounted, the corresponding
>> +entries are deleted.
>> +.SS The event queue
>> +As events occur on the filesystem objects monitored by a notification group,
>> +the fanotify system generates events that are collected in a queue.
>> +These events can then be read (using
>> +.BR read (2)
>> +or similar)
>> +from the fanotify file descriptor
>> +returned by
>> +.BR fanotify_init (2).
>> +
>> +Two types of events are generated:
>> +notification events and permission events.
>> +Notification events are merely informative
>> +and require no action to be taken by
>> +the receiving application except for closing the file descriptor passed in the
>> +event.
>> +Permission events are requests to the receiving application to decide whether
>> +permission for a file access shall be granted.
>> +For these events, the recipient must write a response which decides whether
>> +access is granted or not.
>> +
>> +Queue entries for notification events are removed when the event has been
>> +read.
>> +Queue entries for permission events are removed when the permission
>> +decision has been taken by writing to the fanotify file descriptor.
>> +.SS Reading fanotify events
>> +Calling
>> +.BR read (2)
>> +for the file descriptor returned by
>> +.BR fanotify_init (2)
>> +blocks (if the flag
>> +.B FAN_NONBLOCK
>> +is not specified in the call to
>> +.BR fanotify_init (2))
>> +until either a file event occurs or the call is interrupted by a signal
>> +(see
>> +.BR signal (7)).
>> +
>> +The return value of
>> +.BR read (2)
>> +is the number of bytes placed in the buffer, or \-1 in case of an error.
>> +After a successful
>> +.BR read (2),
>> +the read buffer contains one or more of the following structures:
>> +
>> +.in +4n
>> +.nf
>> +struct fanotify_event_metadata {
>> + __u32 event_len;
>> + __u8 vers;
>> + __u8 reserved;
>> + __u16 metadata_len;
>> + __aligned_u64 mask;
>> + __s32 fd;
>> + __s32 pid;
>> +};
>> +.fi
>> +.in
>> +.PP
>> +The fields of this structure are as follows:
>> +.TP
>> +.I event_len
>> +This is the length of the data for the current event and the offset to the next
>> +event in the buffer.
>> +In the current implementation, the value of
>> +.I event_len
>> +is always
>> +.BR FAN_EVENT_METADATA_LEN .
>> +In principle, the API design allows variable-length structures to be returned.
>> +Therefore, and for performance reasons, it is recommended to use a larger
>> +buffer size when reading, for example 4096 bytes.
>> +.TP
>> +.I vers
>> +This field holds a version number for the structure.
>> +It must be compared to
>> +.B FANOTIFY_METADATA_VERSION
>> +to verify that the structures returned at runtime match
>> +the structures defined at compile time.
>> +In case of a mismatch, the application should abandon trying to use the
>> +fanotify file descriptor.
>> +.TP
>> +.I reserved
>> +This field is not used.
>> +.TP
>> +.I metadata_len
>> +This is the length of the structure.
>> +The field was introduced to facilitate the implementation of optional headers
>> +per event type.
>> +No such optional headers exist in the current implementation.
>> +.TP
>> +.I mask
>> +This is a bit mask describing the event.
>> +.TP
>> +.I fd
>> +This is an open file descriptor for the object being accessed, or
>> +.B FAN_NOFD
>> +if a queue overflow occurred.
>> +The file descriptor can be used to access the contents of the monitored file or
>> +directory.
>> +The
>> +.B FMODE_NONOTIFY
>> +file status flag is set on the corresponding open file description.
>> +This flag suppresses fanotify event generation.
>> +Hence, when the receiver of the fanotify event accesses the notified file or
>> +directory using this file descriptor, no additional events will be created.
>> +The reading application is responsible for closing the file descriptor.
>> +.TP
>> +.I pid
>> +This is the ID of the process that caused the event.
>> +A program listening to fanotify events can compare this PID to the PID returned
>> +by
>> +.BR getpid (2),
>> +to determine whether the event is caused by the listener itself, or is due to a
>> +file access by another program.
>> +.PP
>> +The bit mask in
>> +.I mask
>> +signals which events have occurred for a single filesystem object.
>> +Multiple bits may be set in this mask,
>> +if more than one event occurred for the monitored filesystem object.
>> +In particular,
>> +consecutive events for the same filesystem object and originating from the
>> +same process may be merged into a single event, with the exception that two
>> +permission events are never merged into one queue entry.
>> +.PP
>> +The bits that may appear in
>> +.I mask
>> +are as follows:
>> +.TP
>> +.B FAN_ACCESS
>> +A file or a directory (but see BUGS) was accessed (read).
>> +.TP
>> +.B FAN_OPEN
>> +A file or a directory was opened.
>> +.TP
>> +.B FAN_MODIFY
>> +A file was modified.
>> +.TP
>> +.B FAN_CLOSE_WRITE
>> +A file that was opened for writing
>> +.RB ( O_WRONLY
>> +or
>> +.BR O_RDWR )
>> +was closed.
>> +.TP
>> +.B FAN_CLOSE_NOWRITE
>> +A file or directory that was opened read-only
>> +.RB ( O_RDONLY )
>> +was closed.
>> +.TP
>> +.B FAN_Q_OVERFLOW
>> +The event queue exceeded the limit of 16384 entries.
>> +This limit can be overridden in the call to
>> +.BR fanotify_init (2)
>> +by setting the flag
>> +.BR FAN_UNLIMITED_QUEUE .
>> +.TP
>> +.B FAN_ACCESS_PERM
>> +An application wants to read a file or directory, for example using
>> +.BR read (2)
>> +or
>> +.BR readdir (2).
>> +The reader must write a response that determines whether the permission to
>> +access the filesystem object shall be granted.
>> +.TP
>> +.B FAN_OPEN_PERM
>> +An application wants to open a file or directory.
>> +The reader must write a response that determines whether the permission to
>> +open the filesystem object shall be granted.
>> +.PP
>> +To check for any close event, the following bit mask may be used:
>> +.TP
>> +.B FAN_CLOSE
>> +A file was closed.
>> +This is a synonym for:
>> +
>> + FAN_CLOSE_WRITE | FAN_CLOSE_NOWRITE
>> +.PP
>> +The following macros are provided to iterate over a buffer containing fanotify
>> +event metadata returned by a
>> +.BR read (2)
>> +from an fanotify file descriptor.
>> +.TP
>> +.B FAN_EVENT_OK(meta, len)
>> +This macro checks the remaining length
>> +.I len
>> +of the buffer
>> +.I meta
>> +against the length of the metadata structure and the
>> +.I event_len
>> +field of the first metadata structure in the buffer.
>> +.TP
>> +.B FAN_EVENT_NEXT(meta, len)
>> +This macro sets the pointer
>> +.I meta
>> +to the next metadata structure using the length indicated in the
>> +.I event_len
>> +field of the metadata structure and reduces the remaining length of the
>> +buffer
>> +.IR len .
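
A minimal sketch, not part of the patch, of walking a buffer with the two macros described above. The function name count_foreign_events() is hypothetical. It checks the structure version, skips events caused by the listener itself, and closes each event file descriptor.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/fanotify.h>
#include <sys/types.h>
#include <unistd.h>

/* Walk 'len' bytes of event data in 'buf', as filled in by read(2) on an
   fanotify file descriptor.
   Returns the number of events caused by other processes, or -1 if the
   structure version does not match the one used at compile time. */
int
count_foreign_events(void *buf, ssize_t len)
{
    struct fanotify_event_metadata *meta = buf;
    pid_t self = getpid();
    int count = 0;

    assert(buf != NULL);
    while (FAN_EVENT_OK(meta, len)) {
        if (meta->vers != FANOTIFY_METADATA_VERSION)
            return -1;
        if (meta->pid != self)
            count++;
        /* The reader owns the event file descriptor; FAN_NOFD marks
           a queue overflow, in which case there is nothing to close. */
        if (meta->fd != FAN_NOFD)
            close(meta->fd);
        meta = FAN_EVENT_NEXT(meta, len);
    }
    return count;
}
```

A caller would typically read into a buffer of a few kilobytes and pass the return value of read(2) as len.
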
>> +.SS Monitoring an fanotify file descriptor for events
>> +When an fanotify event occurs, the fanotify file descriptor is reported as
>> +readable when passed to
>> +.BR epoll (7),
>> +.BR poll (2),
>> +or
>> +.BR select (2).
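
A minimal sketch, not part of the patch, of waiting for readability with poll(2) before reading. The function name read_events_timeout() is hypothetical, and nothing in it is specific to fanotify beyond the assumption that the descriptor came from fanotify_init().

```c
#include <assert.h>
#include <poll.h>
#include <sys/types.h>
#include <unistd.h>

/* Wait up to 'timeout_ms' milliseconds for events, then read them.
   Returns the number of bytes read, 0 if the timeout expired first,
   or -1 with errno set if poll(2) or read(2) failed. */
ssize_t
read_events_timeout(int fanotify_fd, void *buf, size_t size, int timeout_ms)
{
    struct pollfd pfd = { .fd = fanotify_fd, .events = POLLIN };
    int ready;

    assert(fanotify_fd >= 0);
    ready = poll(&pfd, 1, timeout_ms);
    if (ready <= 0)
        return ready;       /* 0: timeout, -1: error */
    return read(fanotify_fd, buf, size);
}
```
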
>> +.SS Dealing with permission events
>> +For permission events, the application must
>> +.BR write (2)
>> +a structure of the following form to the
>> +fanotify file descriptor:
>> +
>> +.in +4n
>> +.nf
>> +struct fanotify_response {
>> + __s32 fd;
>> + __u32 response;
>> +};
>> +.fi
>> +.in
>> +.PP
>> +The fields of this structure are as follows:
>> +.TP
>> +.I fd
>> +This is the file descriptor from the structure
>> +.IR fanotify_event_metadata .
>> +.TP
>> +.I response
>> +This field indicates whether or not the permission is to be granted.
>> +Its value must be either
>> +.B FAN_ALLOW
>> +to allow the file operation or
>> +.B FAN_DENY
>> +to deny the file operation.
>> +.PP
>> +If access is denied, the requesting application's call will fail with the
>> +.B EPERM
>> +error.
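
A minimal sketch, not part of the patch, of answering a permission event with the structure described above. The helper names is_permission_event() and answer_permission_event() are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/fanotify.h>
#include <sys/types.h>
#include <unistd.h>

/* Nonzero if 'ev' is a permission event and so requires a response. */
int
is_permission_event(const struct fanotify_event_metadata *ev)
{
    return (ev->mask & (FAN_OPEN_PERM | FAN_ACCESS_PERM)) != 0;
}

/* Write 'decision' (FAN_ALLOW or FAN_DENY) for 'ev' to the fanotify file
   descriptor, then close the event's file descriptor.
   Returns 0 on success, or -1 if the response could not be written. */
int
answer_permission_event(int fanotify_fd,
                        const struct fanotify_event_metadata *ev,
                        uint32_t decision)
{
    struct fanotify_response resp = { .fd = ev->fd, .response = decision };
    ssize_t n;

    assert(decision == FAN_ALLOW || decision == FAN_DENY);
    n = write(fanotify_fd, &resp, sizeof(resp));
    if (ev->fd >= 0)
        close(ev->fd);
    return n == (ssize_t) sizeof(resp) ? 0 : -1;
}
```

The event file descriptor is closed only after the response has been written, because the response identifies the event by that descriptor.
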
>> +.SS Closing the fanotify file descriptor
>> +.PP
>> +When all file descriptors referring to the fanotify notification group are
>> +closed, the fanotify group is released and its resources
>> +are freed for reuse by the kernel.
>> +Upon
>> +.BR close (2),
>> +outstanding permission events will be set to allowed.
>> +.SS /proc/[pid]/fdinfo
>> +The file
>> +.I /proc/[pid]/fdinfo/[fd]
>> +contains information about fanotify marks for file descriptor
>> +.I fd
>> +of process
>> +.IR pid .
>> +See the kernel source file
>> +.I Documentation/filesystems/proc.txt
>> +for details.
>> +.SH ERRORS
>> +In addition to the usual errors for
>> +.BR read (2),
>> +the following errors can occur when reading from the fanotify file descriptor:
>> +.TP
>> +.B EINVAL
>> +The buffer is too short to hold the event.
>> +.TP
>> +.B EMFILE
>> +The per-process limit on the number of open files has been reached.
>> +See the description of
>> +.B RLIMIT_NOFILE
>> +in
>> +.BR getrlimit (2).
>> +.TP
>> +.B ENFILE
>> +The system-wide limit on the number of open files has been reached.
>> +See
>> +.I /proc/sys/fs/file-max
>> +in
>> +.BR proc (5).
>> +.TP
>> +.B ETXTBSY
>> +This error is returned by
>> +.BR read (2)
>> +if
>> +.B O_RDWR
>> +or
>> +.B O_WRONLY
>> +was specified in the
>> +.I event_f_flags
>> +argument when calling
>> +.BR fanotify_init (2)
>> +and an event occurred for a monitored file that is currently being executed.
>> +.PP
>> +In addition to the usual errors for
>> +.BR write (2),
>> +the following errors can occur when writing to the fanotify file descriptor:
>> +.TP
>> +.B EINVAL
>> +Fanotify access permissions are not enabled in the kernel configuration or the
>> +value of
>> +.I response
>> +in the response structure is not valid.
>> +.TP
>> +.B ENOENT
>> +The file descriptor
>> +.I fd
>> +in the response structure is not valid.
>> +This might occur because the file was already deleted by another thread or
>> +process.
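
For illustration, a small sketch of writing a response and checking for the two errors listed above. The function name reply_to_event() and the message texts are assumptions made for this example only.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/fanotify.h>
#include <unistd.h>

/* Answer the permission event identified by 'event_fd' with
   'decision' (FAN_ALLOW or FAN_DENY). Hypothetical helper.
   Returns 0 on success, or -1 with errno as set by write(2). */
static int
reply_to_event(int fanotify_fd, int event_fd, unsigned int decision)
{
    struct fanotify_response response;
    int saved_errno;

    response.fd = event_fd;
    response.response = decision;

    if (write(fanotify_fd, &response, sizeof(response)) != -1)
        return 0;

    saved_errno = errno;
    if (saved_errno == ENOENT)
        fprintf(stderr, "reply: event descriptor %d is not valid\n",
                event_fd);
    else if (saved_errno == EINVAL)
        fprintf(stderr, "reply: invalid response value, or permission "
                "events are not configured\n");
    else
        fprintf(stderr, "reply: %s\n", strerror(saved_errno));

    errno = saved_errno;   /* fprintf() may have changed errno */
    return -1;
}
```

A caller would typically still close(2) the event file descriptor afterwards, whether or not the response was accepted.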
>> +.SH VERSIONS
>> +The fanotify API was introduced in version 2.6.36 of the Linux kernel and
>> +enabled in version 2.6.37.
>> +Fdinfo support was added in version 3.8.
>> +.SH "CONFORMING TO"
>> +The fanotify API is Linux-specific.
>> +.SH NOTES
>> +The fanotify API is available only if the kernel was built with the
>> +.B CONFIG_FANOTIFY
>> +configuration option enabled.
>> +In addition, fanotify permission handling is available only if the
>> +.B CONFIG_FANOTIFY_ACCESS_PERMISSIONS
>> +configuration option is enabled.
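
One way an application could probe for this at run time is sketched below. It assumes what fanotify_init.2 describes: ENOSYS when the kernel lacks fanotify, EPERM when the caller lacks CAP_SYS_ADMIN. The name probe_fanotify() is made up for the example, and the sketch does not test for permission-event support.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/fanotify.h>
#include <unistd.h>

/* Try to create (and immediately release) a notification group.
   Returns 0 if fanotify is usable by this process, otherwise the
   errno value reported by fanotify_init(), for example ENOSYS
   (not built into the kernel) or EPERM (no CAP_SYS_ADMIN).
   probe_fanotify() is a hypothetical helper. */
static int
probe_fanotify(void)
{
    int fd;

    fd = fanotify_init(FAN_CLOEXEC | FAN_CLASS_NOTIF, O_RDONLY);
    if (fd == -1)
        return errno;

    close(fd);
    return 0;
}
```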
>> +.SS Limitations and caveats
>> +Fanotify reports only events that a user-space program triggers through the
>> +filesystem API.
>> +As a result, it does not catch remote events that occur on network filesystems.
>> +.PP
>> +The fanotify API does not report file accesses and modifications that
>> +may occur because of
>> +.BR mmap (2),
>> +.BR msync (2),
>> +and
>> +.BR munmap (2).
>> +.PP
>> +Events for directories are created only when the directory itself is opened,
>> +read, or closed.
>> +Adding, removing, or changing children of a marked directory does not create
>> +events for the monitored directory itself.
>> +.PP
>> +Fanotify monitoring of directories is not recursive: to monitor subdirectories
>> +under a directory, additional marks must be created.
>> +(But note that the fanotify API provides no way of detecting when a
>> +subdirectory has been created under a marked directory, which makes recursive
>> +monitoring difficult.)
>> +Monitoring mounts offers the capability to monitor a whole directory tree.
>> +.PP
>> +The event queue can overflow.
>> +In this case, events are lost.
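
A small sketch of how a reader of the event stream might notice an overflow, assuming the overflow is reported as an event with FAN_Q_OVERFLOW set in its mask and FAN_NOFD in its fd field. The helper name is_queue_overflow() is hypothetical.

```c
#include <sys/fanotify.h>

/* Return 1 if 'metadata' is the queue overflow notification,
   0 for an ordinary event. Hypothetical helper. */
static int
is_queue_overflow(const struct fanotify_event_metadata *metadata)
{
    return (metadata->mask & FAN_Q_OVERFLOW) != 0;
}
```

A program that needs a complete picture would probably rescan the monitored objects when this returns 1, since there is no way to learn which events were dropped.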
>> +.SH BUGS
>> +As of Linux 3.15,
>> +the following bug exists:
>> +.IP * 3
>> +.\" FIXME: A patch was proposed.
>> +When an event is generated, no check is made to see whether the user ID of the
>> +receiving process has authorization to read or write the file before passing a
>> +file descriptor for that file.
>> +This poses a security risk when the
>> +.B CAP_SYS_ADMIN
>> +capability is set for programs executed by unprivileged users.
>> +.SH EXAMPLE
>> +The following program demonstrates the usage of the fanotify API.
>> +It marks the mount point passed as a command-line argument
>> +and waits for events of type
>> +.B FAN_OPEN_PERM
>> +and
>> +.BR FAN_CLOSE_WRITE .
>> +When a permission event occurs, a
>> +.B FAN_ALLOW
>> +response is given.
>> +.PP
>> +The following output was recorded while editing the file
>> +.IR /home/user/temp/notes .
>> +Before the file was opened, a
>> +.B FAN_OPEN_PERM
>> +event occurred.
>> +After the file was closed, a
>> +.B FAN_CLOSE_WRITE
>> +event occurred.
>> +Execution of the program ends when the user presses the ENTER key.
>> +.SS Example output
>> +.in +4n
>> +.nf
>> +# ./fanotify_example /home
>> +Press enter key to terminate.
>> +Listening for events.
>> +FAN_OPEN_PERM: File /home/user/temp/notes
>> +FAN_CLOSE_WRITE: File /home/user/temp/notes
>> +
>> +Listening for events stopped.
>> +.fi
>> +.in
>> +.SS Program source
>> +.nf
>> +#define _GNU_SOURCE /* Needed to get O_LARGEFILE definition */
>> +#include <errno.h>
>> +#include <fcntl.h>
>> +#include <limits.h>
>> +#include <poll.h>
>> +#include <stdio.h>
>> +#include <stdlib.h>
>> +#include <sys/fanotify.h>
>> +#include <unistd.h>
>> +
>> +/* Read all available fanotify events from the file descriptor 'fd' */
>> +
>> +static void
>> +handle_events(int fd)
>> +{
>> +    const struct fanotify_event_metadata *metadata;
>> +    char buf[4096];
>> +    ssize_t len;
>> +    char path[PATH_MAX];
>> +    ssize_t path_len;
>> +    char procfd_path[PATH_MAX];
>> +    struct fanotify_response response;
>> +
>> +    /* Loop while events can be read from fanotify file descriptor */
>> +
>> +    for (;;) {
>> +
>> +        /* Read some events */
>> +
>> +        len = read(fd, buf, sizeof(buf));
>> +        if (len == \-1 && errno != EAGAIN) {
>> +            perror("read");
>> +            exit(EXIT_FAILURE);
>> +        }
>> +
>> +        /* Check if end of available data reached */
>> +
>> +        if (len <= 0)
>> +            break;
>> +
>> +        /* Point to the first event in the buffer */
>> +
>> +        metadata = (struct fanotify_event_metadata *) buf;
>> +
>> +        /* Loop over all events in the buffer */
>> +
>> +        while (FAN_EVENT_OK(metadata, len)) {
>> +
>> +            /* Check that run\-time and compile\-time structures match */
>> +
>> +            if (metadata\->vers != FANOTIFY_METADATA_VERSION) {
>> +                fprintf(stderr,
>> +                        "Mismatch of fanotify metadata version.\\n");
>> +                exit(EXIT_FAILURE);
>> +            }
>> +
>> +            /* metadata\->fd contains either FAN_NOFD, indicating a
>> +               queue overflow, or a file descriptor (a nonnegative
>> +               integer). Here, we simply ignore queue overflow. */
>> +
>> +            if (metadata\->fd >= 0) {
>> +
>> +                /* Handle open permission event */
>> +
>> +                if (metadata\->mask & FAN_OPEN_PERM) {
>> +                    printf("FAN_OPEN_PERM: ");
>> +
>> +                    /* Allow file to be opened */
>> +
>> +                    response.fd = metadata\->fd;
>> +                    response.response = FAN_ALLOW;
>> +                    write(fd, &response,
>> +                          sizeof(struct fanotify_response));
>> +                }
>> +
>> +                /* Handle closing of writable file event */
>> +
>> +                if (metadata\->mask & FAN_CLOSE_WRITE)
>> +                    printf("FAN_CLOSE_WRITE: ");
>> +
>> +                /* Retrieve and print pathname of the accessed file */
>> +
>> +                snprintf(procfd_path, sizeof(procfd_path),
>> +                         "/proc/self/fd/%d", metadata\->fd);
>> +                path_len = readlink(procfd_path, path,
>> +                                    sizeof(path) \- 1);
>> +                if (path_len == \-1) {
>> +                    perror("readlink");
>> +                    exit(EXIT_FAILURE);
>> +                }
>> +
>> +                path[path_len] = '\\0';
>> +                printf("File %s\\n", path);
>> +
>> +                /* Close the file descriptor of the event */
>> +
>> +                close(metadata\->fd);
>> +            }
>> +
>> +            /* Advance to next event */
>> +
>> +            metadata = FAN_EVENT_NEXT(metadata, len);
>> +        }
>> +    }
>> +}
>> +
>> +int
>> +main(int argc, char *argv[])
>> +{
>> +    char buf;
>> +    int fd, poll_num;
>> +    nfds_t nfds;
>> +    struct pollfd fds[2];
>> +
>> +    /* Check mount point is supplied */
>> +
>> +    if (argc != 2) {
>> +        fprintf(stderr, "Usage: %s MOUNT\\n", argv[0]);
>> +        exit(EXIT_FAILURE);
>> +    }
>> +
>> +    printf("Press enter key to terminate.\\n");
>> +
>> +    /* Create the file descriptor for accessing the fanotify API */
>> +
>> +    fd = fanotify_init(FAN_CLOEXEC | FAN_CLASS_CONTENT | FAN_NONBLOCK,
>> +                       O_RDONLY | O_LARGEFILE);
>> +    if (fd == \-1) {
>> +        perror("fanotify_init");
>> +        exit(EXIT_FAILURE);
>> +    }
>> +
>> +    /* Mark the mount for:
>> +       \- permission events before opening files
>> +       \- notification events after closing a write\-enabled
>> +         file descriptor */
>> +
>> +    if (fanotify_mark(fd, FAN_MARK_ADD | FAN_MARK_MOUNT,
>> +                      FAN_OPEN_PERM | FAN_CLOSE_WRITE, \-1,
>> +                      argv[1]) == \-1) {
>> +        perror("fanotify_mark");
>> +        exit(EXIT_FAILURE);
>> +    }
>> +
>> +    /* Prepare for polling */
>> +
>> +    nfds = 2;
>> +
>> +    /* Console input */
>> +
>> +    fds[0].fd = STDIN_FILENO;
>> +    fds[0].events = POLLIN;
>> +
>> +    /* Fanotify input */
>> +
>> +    fds[1].fd = fd;
>> +    fds[1].events = POLLIN;
>> +
>> +    /* This is the loop to wait for incoming events */
>> +
>> +    printf("Listening for events.\\n");
>> +
>> +    while (1) {
>> +        poll_num = poll(fds, nfds, \-1);
>> +        if (poll_num == \-1) {
>> +            if (errno == EINTR)     /* Interrupted by a signal */
>> +                continue;           /* Restart poll() */
>> +
>> +            perror("poll");         /* Unexpected error */
>> +            exit(EXIT_FAILURE);
>> +        }
>> +
>> +        if (poll_num > 0) {
>> +            if (fds[0].revents & POLLIN) {
>> +
>> +                /* Console input is available: empty stdin and quit */
>> +
>> +                while (read(STDIN_FILENO, &buf, 1) > 0 && buf != '\\n')
>> +                    continue;
>> +                break;
>> +            }
>> +
>> +            if (fds[1].revents & POLLIN) {
>> +
>> +                /* Fanotify events are available */
>> +
>> +                handle_events(fd);
>> +            }
>> +        }
>> +    }
>> +
>> +    printf("Listening for events stopped.\\n");
>> +    exit(EXIT_SUCCESS);
>> +}
>> +.fi
>> +.SH "SEE ALSO"
>> +.ad l
>> +.BR fanotify_init (2),
>> +.BR fanotify_mark (2),
>> +.BR inotify (7)
>>
>> --
>> Michael Kerrisk
>> Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
>> Linux/UNIX System Programming Training: http://man7.org/training/