[GIT PULL] Mount notifications

From: David Howells
Date: Mon Aug 03 2020 - 11:28:00 EST



Hi Linus,

Here's a set of patches to add notifications for mount topology events,
such as mounting, unmounting, mount expiry and mount reconfiguration.

The first patch in the series adds a hard limit on the number of watches
that any particular user can add. The RLIMIT_NOFILE value for the process
adding a watch is used as the limit. Even if you don't take the rest of
the series, can you at least take this one?
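As a rough illustration of the limit described above, here is a minimal
userspace model of per-user watch accounting. It is a sketch only, not the
kernel code: struct user_account, charge_watch(), uncharge_watch() and
watch_limit() are hypothetical names, and returning -EAGAIN when the limit
is exceeded is an assumption.

```c
#include <errno.h>
#include <stdatomic.h>
#include <sys/resource.h>

/* Hypothetical per-user accounting record; the name is illustrative only. */
struct user_account {
	atomic_long nr_watches;		/* Watches currently held by this user */
};

/*
 * Charge one watch to the user.  Returns 0 on success, or -EAGAIN if the
 * new count would exceed 'limit' (the errno choice is an assumption).
 */
int charge_watch(struct user_account *user, unsigned long limit)
{
	/* Increment first, then back out if that took us over the limit. */
	long n = atomic_fetch_add(&user->nr_watches, 1) + 1;

	if (n > 0 && (unsigned long)n > limit) {
		atomic_fetch_sub(&user->nr_watches, 1);
		return -EAGAIN;
	}
	return 0;
}

/* Release one previously charged watch. */
void uncharge_watch(struct user_account *user)
{
	atomic_fetch_sub(&user->nr_watches, 1);
}

/* Soft RLIMIT_NOFILE of the calling process, or 0 if it cannot be read. */
unsigned long watch_limit(void)
{
	struct rlimit rl;

	if (getrlimit(RLIMIT_NOFILE, &rl) != 0)
		return 0;
	return (unsigned long)rl.rlim_cur;
}
```

A caller would pass watch_limit() as the limit when adding a watch and call
uncharge_watch() when the watch is removed.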

An LSM hook is included for an LSM to rule on whether or not a mount watch
may be set on a particular path.

This series is intended to be taken in conjunction with the fsinfo series
which I'll post a pull request for shortly and which is dependent on it.

Karel Zak[*] has created preliminary patches that add support to libmount
and Ian Kent has started working on making systemd use them.

[*] https://github.com/karelzak/util-linux/commits/topic/fsinfo

Note that there have been some last-minute changes to the patchset: you
wanted something added and Miklós wanted some bits taken out or changed.
I've placed a tag, fsinfo-core-20200724, on the aggregate of these two
patchsets that can be compared to fsinfo-core-20200803.

To summarise the changes: I added the limiter that you wanted; removed an
unused symbol; made the mount ID fields in the notification 64-bit (the
fsinfo patchset has a change to convey the mount uniquifier instead of the
mount ID); removed the event counters from the mount notification and moved
them into the fsinfo patchset.


====
WHY?
====

Why do we want mount notifications? Whilst /proc/mounts can be polled, it
only tells you that something changed in your namespace. To find out what,
you have to trawl /proc/mounts or similar and work out what changed in the
mount object attributes and mount topology. I'm told that the proc file
holding the namespace_sem is a point of contention, especially as the
process of generating the text descriptions of the mounts/superblocks can
be quite involved.

The notification generated here directly indicates the mounts involved in
any particular event and gives an idea of what the change was.
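To make that concrete, here is a hedged sketch of how a consumer might turn
such a record into a log line. The record layout and every name in it
(struct mount_event_record, enum mount_event, describe_mount_event()) are
hypothetical and are not the UAPI added by this series; the only things
taken from the description above are the kinds of event and the 64-bit
mount ID fields.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical event types mirroring the events named in this message. */
enum mount_event {
	EV_NEW_MOUNT,
	EV_UNMOUNT,
	EV_EXPIRY,
	EV_RECONFIGURE,
};

/* Hypothetical decoded record; NOT the real notification layout. */
struct mount_event_record {
	enum mount_event type;
	uint64_t triggered_on;	/* Mount on which the event occurred */
	uint64_t auxiliary;	/* Other mount involved, or 0 if none */
};

static const char *event_name(enum mount_event type)
{
	switch (type) {
	case EV_NEW_MOUNT:	return "new mount";
	case EV_UNMOUNT:	return "unmount";
	case EV_EXPIRY:		return "expiry";
	case EV_RECONFIGURE:	return "reconfigure";
	}
	return "unknown";
}

/*
 * Format a one-line description of the event into buf.  Returns the
 * snprintf() result, i.e. the length the full description needs.
 */
int describe_mount_event(const struct mount_event_record *rec,
			 char *buf, size_t len)
{
	if (rec->auxiliary)
		return snprintf(buf, len,
				"%s: mount %" PRIu64 ", other mount %" PRIu64,
				event_name(rec->type), rec->triggered_on,
				rec->auxiliary);
	return snprintf(buf, len, "%s: mount %" PRIu64,
			event_name(rec->type), rec->triggered_on);
}
```

The point of the sketch is that the consumer learns which mounts were
involved from the record itself and does not have to reparse /proc/mounts.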

This is combined with a new fsinfo() system call that allows, amongst other
things, an { id, change_counter } tuple to be retrieved in one go from all
the children of a specified mount, allowing buffer overruns to be dealt
with quickly.
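One way a watcher might use those tuples to recover after an overrun is
sketched below: compare a cached snapshot with a fresh one and pick out the
children that are new or whose counter moved. This is an illustration under
assumptions, not the fsinfo() interface: struct mount_change and
find_changed_mounts() are hypothetical names, and how the fresh snapshot is
actually obtained is left out.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace copy of one { id, change_counter } tuple. */
struct mount_change {
	uint64_t id;
	uint64_t change_counter;
};

/*
 * Compare a cached snapshot ('old') with a fresh one ('cur').  The ID of
 * each mount in 'cur' that is absent from 'old', or whose change counter
 * differs, is stored in 'changed' (at most max_changed entries).  Returns
 * the number of IDs stored.  Mounts that vanished from 'cur' are not
 * reported here.
 */
size_t find_changed_mounts(const struct mount_change *old, size_t n_old,
			   const struct mount_change *cur, size_t n_cur,
			   uint64_t *changed, size_t max_changed)
{
	size_t n = 0;

	for (size_t i = 0; i < n_cur && n < max_changed; i++) {
		int unchanged = 0;

		/* Linear search keeps the sketch short; fine for small sets. */
		for (size_t j = 0; j < n_old; j++) {
			if (old[j].id == cur[i].id) {
				unchanged = old[j].change_counter ==
					    cur[i].change_counter;
				break;
			}
		}
		if (!unchanged)
			changed[n++] = cur[i].id;
	}
	return n;
}
```

Only the mounts reported this way need to be re-examined in detail, rather
than the whole mount table.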

This is of use to systemd to improve efficiency:

https://lore.kernel.org/linux-fsdevel/20200227151421.3u74ijhqt6ekbiss@xxxxxxxxxxx/

And it's not just Red Hat that's potentially interested in this:

https://lore.kernel.org/linux-fsdevel/293c9bd3-f530-d75e-c353-ddeabac27cf6@xxxxxxxxx/


David
---
The following changes since commit ba47d845d715a010f7b51f6f89bae32845e6acb7:

  Linux 5.8-rc6 (2020-07-19 15:41:18 -0700)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git tags/mount-notifications-20200803

for you to fetch changes up to 841a0dfa511364fa9a8d67512e0643669f1f03e3:

  watch_queue: sample: Display mount tree change notifications (2020-08-03 12:15:38 +0100)

----------------------------------------------------------------
Mount notifications

----------------------------------------------------------------
David Howells (5):
      watch_queue: Limit the number of watches a user can hold
      watch_queue: Make watch_sizeof() check record size
      watch_queue: Add security hooks to rule on setting mount watches
      watch_queue: Implement mount topology and attribute change notifications
      watch_queue: sample: Display mount tree change notifications

 Documentation/watch_queue.rst               |  12 +-
 arch/alpha/kernel/syscalls/syscall.tbl      |   1 +
 arch/arm/tools/syscall.tbl                  |   1 +
 arch/arm64/include/asm/unistd.h             |   2 +-
 arch/arm64/include/asm/unistd32.h           |   2 +
 arch/ia64/kernel/syscalls/syscall.tbl       |   1 +
 arch/m68k/kernel/syscalls/syscall.tbl       |   1 +
 arch/microblaze/kernel/syscalls/syscall.tbl |   1 +
 arch/mips/kernel/syscalls/syscall_n32.tbl   |   1 +
 arch/mips/kernel/syscalls/syscall_n64.tbl   |   1 +
 arch/mips/kernel/syscalls/syscall_o32.tbl   |   1 +
 arch/parisc/kernel/syscalls/syscall.tbl     |   1 +
 arch/powerpc/kernel/syscalls/syscall.tbl    |   1 +
 arch/s390/kernel/syscalls/syscall.tbl       |   1 +
 arch/sh/kernel/syscalls/syscall.tbl         |   1 +
 arch/sparc/kernel/syscalls/syscall.tbl      |   1 +
 arch/x86/entry/syscalls/syscall_32.tbl      |   1 +
 arch/x86/entry/syscalls/syscall_64.tbl      |   1 +
 arch/xtensa/kernel/syscalls/syscall.tbl     |   1 +
 fs/Kconfig                                  |   9 ++
 fs/Makefile                                 |   1 +
 fs/mount.h                                  |  18 +++
 fs/mount_notify.c                           | 222 ++++++++++++++++++++++++++++
 fs/namespace.c                              |  22 +++
 include/linux/dcache.h                      |   1 +
 include/linux/lsm_hook_defs.h               |   3 +
 include/linux/lsm_hooks.h                   |   6 +
 include/linux/sched/user.h                  |   3 +
 include/linux/security.h                    |   8 +
 include/linux/syscalls.h                    |   2 +
 include/linux/watch_queue.h                 |   7 +-
 include/uapi/asm-generic/unistd.h           |   4 +-
 include/uapi/linux/watch_queue.h            |  31 +++-
 kernel/sys_ni.c                             |   3 +
 kernel/watch_queue.c                        |   8 +
 samples/watch_queue/watch_test.c            |  41 ++++-
 security/security.c                         |   7 +
 37 files changed, 422 insertions(+), 6 deletions(-)
 create mode 100644 fs/mount_notify.c