Re: [RFC v3 1/4] fs: Add generic file system event notifications

From: Beata Michalska
Date: Wed Jun 24 2015 - 11:31:26 EST


On 06/24/2015 10:47 AM, Dmitry Monakhov wrote:
> Beata Michalska <b.michalska@xxxxxxxxxxx> writes:
>
>> Introduce configurable generic interface for file
>> system-wide event notifications, to provide file
>> systems with a common way of reporting any potential
>> issues as they emerge.
>>
>> The notifications are to be issued through generic
>> netlink interface by newly introduced multicast group.
>>
>> Threshold notifications have been included, allowing
>> triggering an event whenever the amount of free space drops
>> below a certain level - or levels to be more precise as two
>> of them are being supported: the lower and the upper range.
>> The notifications work both ways: once the threshold level
>> has been reached, an event shall be generated whenever
>> the number of available blocks goes up again re-activating
>> the threshold.
>>
>> The interface has been exposed through a vfs. Once mounted,
>> it serves as an entry point for the set-up where one can
>> register for particular file system events.
>>
>> Signed-off-by: Beata Michalska <b.michalska@xxxxxxxxxxx>
>> ---
>> Documentation/filesystems/events.txt | 232 ++++++++++
>> fs/Kconfig | 2 +
>> fs/Makefile | 1 +
>> fs/events/Kconfig | 7 +
>> fs/events/Makefile | 5 +
>> fs/events/fs_event.c | 809 ++++++++++++++++++++++++++++++++++
>> fs/events/fs_event.h | 22 +
>> fs/events/fs_event_netlink.c | 104 +++++
>> fs/namespace.c | 1 +
>> include/linux/fs.h | 6 +-
>> include/linux/fs_event.h | 72 +++
>> include/uapi/linux/Kbuild | 1 +
>> include/uapi/linux/fs_event.h | 58 +++
>> 13 files changed, 1319 insertions(+), 1 deletion(-)
>> create mode 100644 Documentation/filesystems/events.txt
>> create mode 100644 fs/events/Kconfig
>> create mode 100644 fs/events/Makefile
>> create mode 100644 fs/events/fs_event.c
>> create mode 100644 fs/events/fs_event.h
>> create mode 100644 fs/events/fs_event_netlink.c
>> create mode 100644 include/linux/fs_event.h
>> create mode 100644 include/uapi/linux/fs_event.h
>>
>> diff --git a/Documentation/filesystems/events.txt b/Documentation/filesystems/events.txt
>> new file mode 100644
>> index 0000000..c2e6227
>> --- /dev/null
>> +++ b/Documentation/filesystems/events.txt
>> @@ -0,0 +1,232 @@
>> +
>> + Generic file system event notification interface
>> +
>> +Document created 23 April 2015 by Beata Michalska <b.michalska@xxxxxxxxxxx>
>> +
>> +1. The reason behind:
>> +=====================
>> +
>> +There are many corner cases when things might get messy with the filesystems.
>> +And it is not always obvious what and when went wrong. Sometimes you might
>> +get some subtle hints that there is something going on - but by the time
>> +you realise it, it might be too late as you are already out-of-space
>> +or the filesystem has been remounted as read-only (i.e.). The generic
>> +interface for the filesystem events fills the gap by providing a rather
>> +easy way of real-time notifications triggered whenever something interesting
>> +happens, allowing filesystems to report events in a common way, as they occur.
>> +
>> +2. How does it work:
>> +====================
>> +
>> +The interface itself has been exposed as fstrace-type Virtual File System,
>> +primarily to ease the process of setting up the configuration for the
>> +notifications. So for starters, it needs to get mounted (obviously):
>> +
>> + mount -t fstrace none /sys/fs/events
>> +
>> +This will unveil the single fstrace filesystem entry - the 'config' file,
>> +through which the notification are being set-up.
>> +
>> +Activating notifications for particular filesystem is as straightforward
>> +as writing into the 'config' file. Note that by default all events, despite
>> +the actual filesystem type, are being disregarded.
>> +
>> +Synopsis of config:
>> +------------------
>> +
>> + MOUNT EVENT_TYPE [L1] [L2]
>> +
>> + MOUNT : the filesystem's mount point
>> + EVENT_TYPE : event types - currently two of them are being supported:
>> +
>> + * generic events ("G") covering most common warnings
>> + and errors that might be reported by any filesystem;
>> + this option does not take any arguments;
>> +
>> + * threshold notifications ("T") - events sent whenever
>> + the amount of available space drops below certain level;
>> + it is possible to specify two threshold levels though
>> + only one is required to properly setup the notifications;
>> + as those refer to the number of available blocks, the lower
>> + level [L1] needs to be higher than the upper one [L2]
>> +
>> +Sample request could look like the following:
>> +
>> + echo /sample/mount/point G T 710000 500000 > /sys/fs/events/config
>> +
>> +Multiple request might be specified provided they are separated with semicolon.
>> +
>> +The configuration itself might be modified at any time. One can add/remove
>> +particular event types for given fielsystem, modify the threshold levels,
>> +and remove single or all entries from the 'config' file.
>> +
>> + - Adding new event type:
>> +
>> + $ echo MOUNT EVENT_TYPE > /sys/fs/events/config
>> +
>> +(Note that is is enough to provide the event type to be enabled without
>> +the already set ones.)
>> +
>> + - Removing event type:
>> +
>> + $ echo '!MOUNT EVENT_TYPE' > /sys/fs/events/config
>> +
>> + - Updating threshold limits:
>> +
>> + $ echo MOUNT T L1 L2 > /sys/fs/events/config
>> +
>> + - Removing single entry:
>> +
>> + $ echo '!MOUNT' > /sys/fs/events/config
>> +
>> + - Removing all entries:
>> +
>> + $ echo > /sys/fs/events/config
>> +
>> +Reading the file will list all registered entries with their current set-up
>> +along with some additional info like the filesystem type and the backing device
>> +name if available.
>> +
>> +Final, though a very important note on the configuration: when and if the
>> +actual events are being triggered falls way beyond the scope of the generic
>> +filesystem events interface. It is up to a particular filesystem
>> +implementation which events are to be supported - if any at all. So if
>> +given filesystem does not support the event notifications, an attempt to
>> +enable those through 'config' file will fail.
>> +
>> +
>> +3. The generic netlink interface support:
>> +=========================================
>> +
>> +Whenever an event notification is triggered (by given filesystem) the current
>> +configuration is being validated to decide whether a userpsace notification
>> +should be launched. If there has been no request (in a mean of 'config' file
>> +entry) for given event, one will be silently disregarded. If, on the other
>> +hand, someone is 'watching' given filesystem for specific events, a generic
>> +netlink message will be sent. A dedicated multicast group has been provided
>> +solely for this purpose so in order to receive such notifications, one should
>> +subscribe to this new multicast group. As for now only the init network
>> +namespace is being supported.
>> +
>> +3.1 Message format
>> +
>> +The FS_NL_C_EVENT shall be stored within the generic netlink message header
>> +as the command field. The message payload will provide more detailed info:
>> +the backing device major and minor numbers, the event code and the id of
>> +the process which action led to the event occurrence. In case of threshold
>> +notifications, the current number of available blocks will be included
>> +in the payload as well.
>> +
>> +
>> + 0 1 2 3
>> + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
>> + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>> + | NETLINK MESSAGE HEADER |
>> + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>> + | GENERIC NETLINK MESSAGE HEADER |
>> + | (with FS_NL_C_EVENT as genlmsghdr cdm field) |
>> + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>> + | Optional user specific message header |
>> + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>> + | GENERIC MESSAGE PAYLOAD: |
>> + +---------------------------------------------------------------+
>> + | FS_NL_A_EVENT_ID (NLA_U32) |
>> + +---------------------------------------------------------------+
>> + | FS_NL_A_DEV_MAJOR (NLA_U32) |
>> + +---------------------------------------------------------------+
>> + | FS_NL_A_DEV_MINOR (NLA_U32) |
>
...

>> +
>> +static int create_common_msg(struct sk_buff *skb, void *data)
>> +{
>> + struct fs_trace_entry *en = (struct fs_trace_entry *)data;
>> + struct super_block *sb = en->sb;
>> +
>> + if (nla_put_u32(skb, FS_NL_A_DEV_MAJOR, MAJOR(sb->s_dev))
>> + || nla_put_u32(skb, FS_NL_A_DEV_MINOR, MINOR(sb->s_dev)))
>> + return -EINVAL;
> What about diskless(nfs,cifs,etc) filesystem? btrfs also has no
> valid sb->s_dev

Those are using the anon ids, generated by get_anon_bdev
(through set_anon_super). This id will be visible in
/proc/self/mountinfo or through stat. i.e:

30 22 0:21 / /root/fake_fs/btrfs rw,realtime - btrfs /dev/loop4 rw,nospace_cache


Best Regards
Beata


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/