Re: [PATCH 00/17] VFS: Filesystem information and notifications [ver #17]

From: Karel Zak
Date: Thu Feb 27 2020 - 10:14:40 EST


On Thu, Feb 27, 2020 at 02:45:27PM +0100, Miklos Szeredi wrote:
> > So the problem I want to see fixed is the effect of very large
> > mount tables on other user space applications, particularly the
> > effect when a large number of mounts or umounts are performed.

Yes, right now you have to generate (in the kernel) and parse (in
userspace) the whole mount table to get information about just
one mount table entry. This is typical for umount or systemd.

> > > - add a notification mechanism
> > > - lookup a mount based on path
> > > - and a way to selectively query mount/superblock information
> > based on path ...

For umount-like use-cases we need mountpoint/ to mount entry
conversion; I guess something like open(mountpoint/) + fsinfo()
should be good enough.

For systemd we need the same, but triggered by a notification. The ideal
solution is to get a mount entry ID or FD from the notification and later
use this ID or FD to ask for details about the mount entry (probably again
with fsinfo()). The notification has to be usable in an epoll() set.

This solves 99% of our performance issues I guess.

> > So that means mount table info. needs to be maintained, whether that
> > can be achieved using sysfs I don't know. Creating and maintaining
> > the sysfs tree would be a big challenge I think.

It will still be necessary to get the complete mount table sometimes, but
not in performance-sensitive scenarios.

I'm not sure about sysfs/; you need to somehow resolve namespaces, the
order of the mount entries (which one is the last one), etc. IMHO
translating a mountpoint path to a sysfs/ path will be complicated.

> > But before trying to work out how to use a notification mechanism
> > just having a way to get the info provided by the proc tables using
> > a path alone should give initial immediate improvement in libmount.
>
> Adding Karel, Lennart, Zbigniew and util-linux@xxxxxxx
>
> At a quick glance at libmount and systemd code, it appears that just
> switching out the implementation in libmount will not be enough:
> systemd is calling functions like mnt_table_parse_*() when it receives
> a notification that the mount table changed.

We're ready to change this stuff in systemd once there is something
better (something per-mount-entry).

My plan is to add a new API to libmount to query information about one
mount entry (but I have had no time to play with fsinfo yet).

> What is the end purpose of parsing the mount tables? Can systemd guys
> comment on that?

If mount/umount is triggered by systemd then it needs verification
of success and the final version of the mount options. It also reads
information from libmount to get userspace mount options (e.g.
_netdev -- libmount uses mount source, target and fsroot to join
kernel and userspace stuff).

And don't forget that mount units are part of systemd dependencies, so
umount/mount is an important event for systemd and it needs details about
the changes (what, where, ... etc.)

Karel

--
Karel Zak <kzak@xxxxxxxxxx>
http://karelzak.blogspot.com