Re: [PATCH 00/17] VFS: Filesystem information and notifications [ver #17]

From: Ian Kent
Date: Tue Mar 03 2020 - 00:36:10 EST


On Mon, 2020-03-02 at 10:09 +0100, Miklos Szeredi wrote:
> On Fri, Feb 28, 2020 at 5:36 PM David Howells <dhowells@xxxxxxxxxx>
> wrote:
> > sysfs also has some other disadvantages for this:
> >
> > (1) There's a potential chicken-and-egg problem in that you have to
> >     create a bunch of files and dirs in sysfs for every created mount
> >     and superblock (possibly excluding special ones like the socket
> >     mount) - but this includes sysfs itself. This might work -
> >     provided you create sysfs first.
>
> Sysfs architecture looks something like this (I hope Greg will correct
> me if I'm wrong):
>
> device driver -> kobj tree <- sysfs tree
>
> The kobj tree is created by the device driver, and the dentry tree is
> created on demand from the kobj tree. Lifetime of kobjs is bound to
> both the sysfs objects and the device but not the other way round.
> I.e. device can go away while the sysfs object is still being
> referenced, and sysfs can be freely mounted and unmounted
> independently of device initialization.
>
> So there's no ordering requirement between sysfs mounts and other
> mounts. I might be wrong on the details, since mounts are created
> very early in the boot process...
>
> > (2) sysfs is memory intensive. The directory structure has to be
> >     backed by dentries and inodes that linger as long as the
> >     referenced object does (procfs is more efficient in this regard
> >     for files that aren't being accessed)
>
> See above: I don't think dentries and inodes are pinned, only kobjs
> and their associated cruft. Which may be too heavy, depending on the
> details of the kobj tree.
>
> > (3) It gives people extra, indirect ways to pin mount objects and
> > superblocks.
>
> See above.
>
> > For the moment, fsinfo() gives you three ways of referring to a
> > filesystem object:
> >
> > (a) Directly by path.
>
> A path is always representable by an O_PATH descriptor.
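
To make the O_PATH point concrete, here is a minimal sketch, assuming a
Linux system and Python's os module. The helper names open_path_handle()
and describe() are made up for illustration and are not part of any
proposed interface. It only shows that such a descriptor names a location
without opening it for I/O, and that fstat() still works on it.

```python
import os


def open_path_handle(path):
    """Return an O_PATH descriptor for `path` (hypothetical helper).

    The descriptor identifies the filesystem location; it cannot be
    used for read() or write().
    """
    return os.open(path, os.O_PATH | os.O_CLOEXEC)


def describe(fd):
    """Report the device and inode behind an O_PATH descriptor."""
    st = os.fstat(fd)  # fstat() is permitted on O_PATH descriptors
    return {"dev": st.st_dev, "ino": st.st_ino}


if __name__ == "__main__":
    fd = open_path_handle("/")
    try:
        print(describe(fd))
    finally:
        os.close(fd)
```
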
>
> > (b) By path associated with an fd.
>
> See my proposal about linking from /proc/$PID/fdmount/$FD ->
> /sys/devices/virtual/mounts/$MOUNT_ID.
>
> > (c) By mount ID (perm checked by working back up the tree).
>
> Check that perm on lookup of /sys/devices/virtual/mounts/$MOUNT_ID.
> The proc symlink would bypass the lookup check by directly jumping to
> the mountinfo dir.
>
> > but will need to add:
> >
> > (d) By fscontext fd (which is hard to find in sysfs). Indeed, the
> >     superblock may not even exist yet.
>
> Proc symlink would work for that too.

There's mount enumeration too; ordering is required to identify the
top (or bottom, depending on terminology) mount when more than one
mount is stacked on a mount point.
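
As a rough illustration of the ordering problem, the sketch below picks
the top mount at a mount point from text in the /proc/self/mountinfo
format. It assumes the usual case where a mount stacked on another at the
same path has the lower mount as its parent. The function names and the
sample data are made up for the example, and escaped characters in mount
points (such as \040) are not decoded.

```python
def parse_mountinfo(text):
    """Parse mountinfo-format text into a list of dicts (illustrative)."""
    mounts = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 5:
            continue  # skip blank or malformed lines
        mounts.append({
            "mount_id": int(fields[0]),
            "parent_id": int(fields[1]),
            "mount_point": fields[4],  # octal escapes left undecoded
        })
    return mounts


def topmost_mount(mounts, mount_point):
    """Return the mount at `mount_point` that nothing else there covers.

    Assumption: a mount stacked over another at the same path lists the
    lower mount's ID as its parent ID.
    """
    stacked = [m for m in mounts if m["mount_point"] == mount_point]
    covered = {m["parent_id"] for m in stacked}
    tops = [m for m in stacked if m["mount_id"] not in covered]
    # If several candidates remain, fall back to the last one listed.
    return tops[-1] if tops else None


# Made-up sample: two tmpfs mounts stacked on /mnt.
SAMPLE = (
    "22 1 8:1 / / rw,relatime - ext4 /dev/sda1 rw\n"
    "40 22 0:35 / /mnt rw,relatime - tmpfs first rw\n"
    "41 40 0:36 / /mnt rw,relatime - tmpfs second rw\n"
)

if __name__ == "__main__":
    print(topmost_mount(parse_mountinfo(SAMPLE), "/mnt"))
```

Without the parent IDs (or a guaranteed listing order) there is no way to
tell which of the two /mnt entries is the visible one.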

>
> If sysfs is too heavy, this could be proc or a completely new
> filesystem. The implementation is much less relevant at this stage of
> the discussion than the interface.

Ha, proc with the seq file interface? That has already proved not to
work properly, and it looks difficult to fix.

Ian