Re: [GIT PULL] Mount notifications

From: Ian Kent
Date: Mon Aug 03 2020 - 18:48:58 EST


On Mon, 2020-08-03 at 16:27 +0100, David Howells wrote:
> Hi Linus,
>
> Here's a set of patches to add notifications for mount topology events,
> such as mounting, unmounting, mount expiry, mount reconfiguration.
>
> The first patch in the series adds a hard limit on the number of watches
> that any particular user can add. The RLIMIT_NOFILE value for the process
> adding a watch is used as the limit. Even if you don't take the rest of
> the series, can you at least take this one?
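
For anyone wanting to see what that limit would be for a given process, here is a minimal sketch that reads RLIMIT_NOFILE from userspace. It only uses Python's standard `resource` module; the helper names `nofile_limits` and `describe` are mine, not part of the series. The text above does not say whether the soft or the hard value is the one compared against, so the sketch prints both.

```python
import resource


def nofile_limits():
    """Return the (soft, hard) RLIMIT_NOFILE pair for the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def describe(value):
    """Render a limit, mapping RLIM_INFINITY to a readable word."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


if __name__ == "__main__":
    soft, hard = nofile_limits()
    # Which of the two the kernel checks is an open question in this sketch.
    print(f"RLIMIT_NOFILE soft={describe(soft)} hard={describe(hard)}")
```
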
>
> An LSM hook is included for an LSM to rule on whether or not a mount watch
> may be set on a particular path.
>
> This series is intended to be taken in conjunction with the fsinfo series
> which I'll post a pull request for shortly and which is dependent on it.
>
> Karel Zak[*] has created preliminary patches that add support to libmount
> and Ian Kent has started working on making systemd use them.
>
> [*] https://github.com/karelzak/util-linux/commits/topic/fsinfo
>
> Note that there have been some last minute changes to the patchset: you
> wanted something adding and Miklós wanted some bits taking out/changing.
> I've placed a tag, fsinfo-core-20200724 on the aggregate of these two
> patchsets that can be compared to fsinfo-core-20200803.
>
> To summarise the changes: I added the limiter that you wanted; removed an
> unused symbol; made the mount ID fields in the notification 64-bit (the
> fsinfo patchset has a change to convey the mount uniquifier instead of the
> mount ID); removed the event counters from the mount notification and moved
> the event counters into the fsinfo patchset.

I've pushed my systemd changes to a GitHub repo.
I haven't yet updated them for the changes above, but will get to it.

They can be found at:
https://github.com/raven-au/systemd.git branch notifications-devel

>
>
> ====
> WHY?
> ====
>
> Why do we want mount notifications? Whilst /proc/mounts can be polled, it
> only tells you that something changed in your namespace. To find out, you
> have to trawl /proc/mounts or similar to work out what changed in the mount
> object attributes and mount topology. I'm told that the proc file holding
> the namespace_sem is a point of contention, especially as the process of
> generating the text descriptions of the mounts/superblocks can be quite
> involved.
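
To make the cost of the polling approach concrete, here is a rough sketch of what a userspace watcher has to do today. It relies on the behaviour documented in proc(5), where a change to the mount table is signalled as POLLPRI/POLLERR on the open file. The function names (`parse_mounts`, `diff_mounts`, `watch_mounts`) are illustrative only, and the parser looks at just the first three fields, so a change to mount options alone would go unnoticed here.

```python
import select


def parse_mounts(text):
    """Return a set of (mount point, fs type, source) tuples from mounts text."""
    entries = set()
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 3:
            source, target, fstype = fields[:3]
            entries.add((target, fstype, source))
    return entries


def diff_mounts(old_text, new_text):
    """Compare two snapshots; return (added, removed) as sorted lists."""
    old, new = parse_mounts(old_text), parse_mounts(new_text)
    return sorted(new - old), sorted(old - new)


def watch_mounts(path="/proc/self/mounts"):
    """Block on the mount table and yield (added, removed) after each change."""
    with open(path) as f:
        previous = f.read()
        poller = select.poll()
        # proc(5): a mount table change is reported as POLLPRI/POLLERR.
        poller.register(f.fileno(), select.POLLPRI | select.POLLERR)
        while True:
            poller.poll()        # wakes up, but does not say what changed
            f.seek(0)
            current = f.read()   # the whole table is regenerated and re-read
            yield diff_mounts(previous, current)  # and then diffed by hand
            previous = current
```

Every wakeup costs a full re-read and diff of the table, which is the work a notification naming the affected mounts would avoid.
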
>
> The notification generated here directly indicates the mounts involved in
> any particular event and gives an idea of what the change was.
>
> This is combined with a new fsinfo() system call that allows, amongst other
> things, the ability to retrieve in one go an { id, change_counter } tuple
> from all the children of a specified mount, allowing buffer overruns to be
> dealt with quickly.
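
As a sketch of how the { id, change_counter } tuples could be used to recover after an overrun: keep the last snapshot, fetch a fresh one, and compare. The code below does not call fsinfo() at all; it assumes the tuples have already been collected into plain dicts, and the function name `changed_mounts` and the sample IDs and counters are made up for illustration.

```python
def changed_mounts(before, after):
    """Compare two {mount_id: change_counter} snapshots.

    Returns (added, removed, modified) as sorted lists of mount IDs.
    """
    added = sorted(after.keys() - before.keys())
    removed = sorted(before.keys() - after.keys())
    modified = sorted(
        mid for mid in before.keys() & after.keys() if before[mid] != after[mid]
    )
    return added, removed, modified


if __name__ == "__main__":
    # Hypothetical snapshots; real values would come from the kernel.
    before = {21: 3, 22: 1, 25: 7}
    after = {21: 3, 22: 2, 30: 1}
    print(changed_mounts(before, after))
```

Only the mounts listed as added or modified would then need a closer look, rather than the whole tree.
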
>
> This is of use to systemd to improve efficiency:
>
>
> https://lore.kernel.org/linux-fsdevel/20200227151421.3u74ijhqt6ekbiss@xxxxxxxxxxx/
>
> And it's not just Red Hat that's potentially interested in this:
>
>
> https://lore.kernel.org/linux-fsdevel/293c9bd3-f530-d75e-c353-ddeabac27cf6@xxxxxxxxx/
>
>
> David
> ---
> The following changes since commit ba47d845d715a010f7b51f6f89bae32845e6acb7:
>
>   Linux 5.8-rc6 (2020-07-19 15:41:18 -0700)
>
> are available in the Git repository at:
>
>   git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git tags/mount-notifications-20200803
>
> for you to fetch changes up to 841a0dfa511364fa9a8d67512e0643669f1f03e3:
>
>   watch_queue: sample: Display mount tree change notifications (2020-08-03 12:15:38 +0100)
>
> ----------------------------------------------------------------
> Mount notifications
>
> ----------------------------------------------------------------
> David Howells (5):
>       watch_queue: Limit the number of watches a user can hold
>       watch_queue: Make watch_sizeof() check record size
>       watch_queue: Add security hooks to rule on setting mount watches
>       watch_queue: Implement mount topology and attribute change notifications
>       watch_queue: sample: Display mount tree change notifications
>
>  Documentation/watch_queue.rst               |  12 +-
>  arch/alpha/kernel/syscalls/syscall.tbl      |   1 +
>  arch/arm/tools/syscall.tbl                  |   1 +
>  arch/arm64/include/asm/unistd.h             |   2 +-
>  arch/arm64/include/asm/unistd32.h           |   2 +
>  arch/ia64/kernel/syscalls/syscall.tbl       |   1 +
>  arch/m68k/kernel/syscalls/syscall.tbl       |   1 +
>  arch/microblaze/kernel/syscalls/syscall.tbl |   1 +
>  arch/mips/kernel/syscalls/syscall_n32.tbl   |   1 +
>  arch/mips/kernel/syscalls/syscall_n64.tbl   |   1 +
>  arch/mips/kernel/syscalls/syscall_o32.tbl   |   1 +
>  arch/parisc/kernel/syscalls/syscall.tbl     |   1 +
>  arch/powerpc/kernel/syscalls/syscall.tbl    |   1 +
>  arch/s390/kernel/syscalls/syscall.tbl       |   1 +
>  arch/sh/kernel/syscalls/syscall.tbl         |   1 +
>  arch/sparc/kernel/syscalls/syscall.tbl      |   1 +
>  arch/x86/entry/syscalls/syscall_32.tbl      |   1 +
>  arch/x86/entry/syscalls/syscall_64.tbl      |   1 +
>  arch/xtensa/kernel/syscalls/syscall.tbl     |   1 +
>  fs/Kconfig                                  |   9 ++
>  fs/Makefile                                 |   1 +
>  fs/mount.h                                  |  18 +++
>  fs/mount_notify.c                           | 222 ++++++++++++++++++++++++++++
>  fs/namespace.c                              |  22 +++
>  include/linux/dcache.h                      |   1 +
>  include/linux/lsm_hook_defs.h               |   3 +
>  include/linux/lsm_hooks.h                   |   6 +
>  include/linux/sched/user.h                  |   3 +
>  include/linux/security.h                    |   8 +
>  include/linux/syscalls.h                    |   2 +
>  include/linux/watch_queue.h                 |   7 +-
>  include/uapi/asm-generic/unistd.h           |   4 +-
>  include/uapi/linux/watch_queue.h            |  31 +++-
>  kernel/sys_ni.c                             |   3 +
>  kernel/watch_queue.c                        |   8 +
>  samples/watch_queue/watch_test.c            |  41 ++++-
>  security/security.c                         |   7 +
>  37 files changed, 422 insertions(+), 6 deletions(-)
>  create mode 100644 fs/mount_notify.c
>