Re: [GIT PULL] Filesystem Information

From: Ian Kent
Date: Tue Aug 04 2020 - 21:33:57 EST


On Tue, 2020-08-04 at 16:36 +0200, Miklos Szeredi wrote:
> On Tue, Aug 4, 2020 at 4:15 AM Ian Kent <raven@xxxxxxxxxx> wrote:
> > On Mon, 2020-08-03 at 18:42 +0200, Miklos Szeredi wrote:
> > > On Mon, Aug 3, 2020 at 5:50 PM David Howells <dhowells@xxxxxxxxxx
> > > >
> > > wrote:
> > > > Hi Linus,
> > > >
> > > > Here's a set of patches that adds a system call, fsinfo(), that
> > > > allows
> > > > information about the VFS, mount topology, superblock and files
> > > > to
> > > > be
> > > > retrieved.
> > > >
> > > > The patchset is based on top of the mount notifications
> > > > patchset so
> > > > that
> > > > the mount notification mechanism can be hooked to provide event
> > > > counters
> > > > that can be retrieved with fsinfo(), thereby making it a lot
> > > > faster
> > > > to work
> > > > out which mounts have changed.
> > > >
> > > > Note that there was a last minute change requested by Miklós:
> > > > the
> > > > event
> > > > counter bits got moved from the mount notification patchset to
> > > > this
> > > > one.
> > > > The counters got made atomic_long_t inside the kernel and __u64
> > > > in
> > > > the
> > > > UAPI. The aggregate changes can be assessed by comparing pre-
> > > > change tag,
> > > > fsinfo-core-20200724 to the requested pull tag.
> > > >
> > > > Karel Zak has created preliminary patches that add support to
> > > > libmount[*]
> > > > and Ian Kent has started working on making systemd use these
> > > > and
> > > > mount
> > > > notifications[**].
> > >
> > > So why are you asking to pull at this stage?
> > >
> > > Has anyone done a review of the patchset?
> >
> > I have been working with the patch set as it has evolved for quite
> > a
> > while now.
> >
> > I've been reading the kernel code quite a bit and forwarded
> > questions
> > and minor changes to David as they arose.
> >
> > As for a review, not specifically, but while the series implements
> > a
> > rather large change it's surprisingly straight forward to read.
> >
> > In the time I have been working with it I haven't noticed any
> > problems
> > except for those few minor things that I reported to David early on
> > (in
> > some cases accompanied by simple patches).
> >
> > And more recently (obviously) I've been working with the mount
> > notifications changes and, from a readability POV, I find it's the
> > same as the fsinfo() code.
> >
> > > I think it's obvious that this API needs more work. The
> > > integration
> > > work done by Ian is a good direction, but it's not quite the full
> > > validation and review that a complex new API needs.
> >
> > Maybe but the system call is fundamental to making notifications
> > useful
> > and, as I say, after working with it for quite a while I don't fell
> > there's missing features (that David hasn't added along the way)
> > and
> > have found it provides what's needed for what I'm doing (for mount
> > notifications at least).
>
> Apart from the various issues related to the various mount ID's and
> their sizes, my general comment is (and was always): why are we
> adding
> a multiplexer for retrieval of mostly unrelated binary structures?
>
> <linux/fsinfo.h> is 345 lines. This is not a simple and clean API.
>
> A simple and clean replacement API would be:
>
> int get_mount_attribute(int dfd, const char *path, const char
> *attr_name, char *value_buf, size_t buf_size, int flags);
>
> No header file needed with dubiously sized binary values.
>
> The only argument was performance, but apart from purely synthetic
> microbenchmarks that hasn't been proven to be an issue.
>
> And notice how similar the above interface is to getxattr(), or the
> proposed readfile(). Where has the "everything is a file"
> philosophy
> gone?

Maybe, but that philosophy (in a roundabout way) is what's resulted
in some of the problems we now have. Granted it's blind application
of that philosophy rather than the philosophy itself but that is
what happens.

I get that your comments are driven by the way that philosophy should
be applied which is more of a "if it works best doing it that way then
do it that way, and that's usually a file".

In this case there is a logical division of various types of file
system information and the underlying suggestion is maybe it's time
to move away from the "everything is a file" hard and fast rule,
and get rid of some of the problems that have resulted from it.

The notifications is an example, yes, the delivery mechanism is
a "file" but the design of the queueing mechanism makes a lot of
sense for the throughput that's going to be needed as time marches
on. Then there's different sub-systems each with unique information
that needs to be deliverable some other way because delivering "all"
the information via the notification would be just plain wrong so
a multi-faceted information delivery mechanism makes the most
sense to allow specific targeted retrieval of individual items of
information.

But that also supposes your at least open to the idea that "maybe
not everything should be a file".

>
> I think we already lost that with the xattr API, that should have
> been
> done in a way that fits this philosophy. But given that we have "/"
> as the only special purpose char in filenames, and even repetitions
> are allowed, it's hard to think of a good way to do that. Pity.
>
> Still I think it would be nice to have a general purpose attribute
> retrieval API instead of the multiplicity of binary ioctls, xattrs,
> etc.
>
> Is that totally crazy? Nobody missing the beauty in recently
> introduced APIs?
>
> Thanks,
> Miklos