Re: [PATCH 00/13] VFS: Filesystem information [ver #19]
From: Miklos Szeredi
Date: Thu Mar 19 2020 - 08:37:15 EST
On Thu, Mar 19, 2020 at 11:37 AM David Howells <dhowells@xxxxxxxxxx> wrote:
>
> Miklos Szeredi <miklos@xxxxxxxxxx> wrote:
>
> > > (2) It's more efficient as we can return specific binary data rather than
> > > making huge text dumps. Granted, sysfs and procfs could present the
> > > same data, though as lots of little files which have to be
> > > individually opened, read, closed and parsed.
> >
> > Asked this a number of times, but you haven't answered yet: what
> > application would require such a high efficiency?
>
> Low efficiency means more time doing this when that time could be spent doing
> other things - or even putting the CPU in a powersaving state. Using an
> open/read/close render-to-text-and-parse interface *will* be slower and less
> efficient as there are more things you have to do to use it.
>
> Then consider doing a walk over all the mounts in the case where there are
> 10000 of them - we have issues with /proc/mounts for such. fsinfo() will end
> up doing a lot less work.

Current /proc/mounts problems arise from the fact that mount info can
only be queried for the whole namespace, and hence changes related to
a single mount will require rescanning the complete mount list. If
mount info can be queried for individual mounts, then the need to scan
the complete list will be rare. That's *the* point of this change.

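To make the cost of the whole-namespace interface concrete, here is a
rough userspace sketch (illustration only, not part of any patch). It
looks up one mount by ID in text laid out like /proc/self/mountinfo;
the function name find_mount_point() and its fixed field sizes are
assumptions made up for this example. The point is that even a
single-mount lookup has to walk the list line by line:

```c
#include <stdio.h>
#include <string.h>

/*
 * Scan mountinfo-formatted text line by line for the given mount ID.
 * Each line starts with: mount ID, parent ID, major:minor, root,
 * mount point.  Returns 0 and copies the mount point into 'out' on
 * success, or -1 if no line carries that ID.
 * (Hypothetical helper, for illustration only.)
 */
int find_mount_point(const char *text, int want_id, char *out, size_t outlen)
{
	const char *line = text;

	while (*line) {
		int id, parent;
		unsigned int major, minor;
		char root[256], mnt[256];

		if (sscanf(line, "%d %d %u:%u %255s %255s",
			   &id, &parent, &major, &minor, root, mnt) == 6 &&
		    id == want_id) {
			snprintf(out, outlen, "%s", mnt);
			return 0;
		}

		/* Advance to the next line, if there is one. */
		line = strchr(line, '\n');
		if (!line)
			break;
		line++;
	}
	return -1;
}
```

With 10000 mounts, every change notification that forces a lookup like
this costs a pass over all 10000 lines, which is what a per-mount query
would avoid.
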
> > > (3) We wouldn't have the overhead of open and close (even adding a
> > > self-contained readfile() syscall has to do that internally).
> >
> > Busted: add f_op->readfile() and be done with all that. For example
> > DEFINE_SHOW_ATTRIBUTE() could be trivially moved to that interface.
>
> Look at your example. "f_op->". That's "file->f_op->" I presume.
>
> You would have to make it "i_op->" to avoid the open and the close - and for
> things like procfs and sysfs, that's probably entirely reasonable - but bear
> in mind that you still have to apply all the LSM file security controls, just
> in case the backing filesystem is, say, ext4 rather than procfs.
>
> > We could optimize existing proc, sys, etc. interfaces, but it's not
> > been an issue, apparently.
>
> You can't get rid of or change many of the existing interfaces. A lot of them
> are effectively indirect system calls and are, as such, part of the fixed
> UAPI. You'd have to add a parallel optimised set.

Sure.

We already have the single_open() internal API that is basically a
->readfile() wrapper. Moving this up to the f_op level (no, it's not
an i_op, and yes, we do need struct file, but it can simply be
allocated on the stack) is a trivial optimization that would let a
readfile(2) syscall access that level. No new complexity in that
case. The same generally goes for seq_file: seq_readfile() is trivial
to implement without messing with the current implementation or any
existing APIs.

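A rough userspace model of that idea, as a sketch only: this is not
kernel code, and struct seq_buf_model, model_printf() and
model_readfile() are names invented for this example. A single "show"
callback renders into a caller-supplied buffer, and the readfile-style
wrapper calls it directly, with no open or close step in between:

```c
#include <stdarg.h>
#include <stdio.h>
#include <sys/types.h>

/* Minimal stand-in for the buffer side of a seq_file (hypothetical). */
struct seq_buf_model {
	char *buf;
	size_t size;
	size_t count;
};

typedef int (*show_fn)(struct seq_buf_model *m, void *data);

/* Append formatted text; on overflow mark the buffer as full. */
static void model_printf(struct seq_buf_model *m, const char *fmt, ...)
{
	va_list ap;
	int n;

	if (m->count >= m->size)
		return;
	va_start(ap, fmt);
	n = vsnprintf(m->buf + m->count, m->size - m->count, fmt, ap);
	va_end(ap);
	if (n < 0)
		return;
	if ((size_t)n >= m->size - m->count)
		m->count = m->size;	/* did not fit */
	else
		m->count += (size_t)n;
}

/*
 * One-shot read: run the show callback once and return the number of
 * bytes produced, or -1 on error or if the output did not fit.
 */
ssize_t model_readfile(show_fn show, void *data, char *buf, size_t size)
{
	struct seq_buf_model m = { .buf = buf, .size = size, .count = 0 };

	if (show(&m, data))
		return -1;
	if (m.count >= size)
		return -1;
	return (ssize_t)m.count;
}

/* Example show callback: renders a single integer attribute. */
int example_show(struct seq_buf_model *m, void *data)
{
	int *value = data;

	model_printf(m, "value: %d\n", *value);
	return 0;
}
```

The show callback is the only per-file code; everything else is shared,
which is the property single_open() users already have.
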
>
> > > (6) Don't have to create/delete a bunch of sysfs/procfs nodes each time a
> > > mount happens or is removed - and since systemd makes much use of
> > > mount namespaces and mount propagation, this will create a lot of
> > > nodes.
> >
> > Not true.
>
> This may not be true if you roll your own special filesystem. It *is* true if
> you do it in procfs or sysfs. The files don't exist if you don't create nodes
> or attribute tables for them.

That's one of the reasons why I opted to roll my own. But the ideas
therein could be applied to kernfs, if found to be generally useful.
Nothing magic about that.

>
> > > The argument for doing this through procfs/sysfs/somemagicfs is that
> > > someone using a shell can just query the magic files using ordinary text
> > > tools, such as cat - and that has merit - but it doesn't solve the
> > > query-by-pathname problem.
> > >
> > > The suggested way around the query-by-pathname problem is to open the
> > > target file O_PATH and then look in a magic directory under procfs
> > > corresponding to the fd number to see a set of attribute files[*] laid out.
> > > Bash, however, can't open by O_PATH or O_NOFOLLOW as things stand...
> >
> > Bash doesn't have fsinfo(2) either, so that's not really a good argument.
>
> I never claimed that fsinfo() could be accessed directly from the shell. For
> your proposal, you claimed "immediately usable from all programming languages,
> including scripts".

You are right. Note however: only special files need the O_PATH
handling; regular files and directories can be opened by the shell
without side effects.

In any case, I think neither of us can be convinced that the other is
right, so I guess it's up to Al and Linus to make a decision.

Thanks,
Miklos