Re: [PATCH 00/17] VFS: Filesystem information and notifications [ver #17]

From: Jeff Layton
Date: Tue Mar 03 2020 - 11:51:06 EST


On Tue, 2020-03-03 at 08:44 -0700, Jens Axboe wrote:
> On 3/3/20 7:24 AM, Greg Kroah-Hartman wrote:
> > On Tue, Mar 03, 2020 at 03:13:26PM +0100, Jann Horn wrote:
> > > On Tue, Mar 3, 2020 at 3:10 PM Greg Kroah-Hartman
> > > <gregkh@xxxxxxxxxxxxxxxxxxx> wrote:
> > > > On Tue, Mar 03, 2020 at 02:43:16PM +0100, Greg Kroah-Hartman wrote:
> > > > > On Tue, Mar 03, 2020 at 02:34:42PM +0100, Miklos Szeredi wrote:
> > > > > > On Tue, Mar 3, 2020 at 2:14 PM Greg Kroah-Hartman
> > > > > > <gregkh@xxxxxxxxxxxxxxxxxxx> wrote:
> > > > > >
> > > > > > > > Unlimited beers for a 21-line kernel patch? Sign me up!
> > > > > > > >
> > > > > > > > Totally untested, barely compiled patch below.
> > > > > > >
> > > > > > > Ok, that didn't even build, let me try this for real now...
> > > > > >
> > > > > > Some comments on the interface:
> > > > >
> > > > > Ok, hey, let's do this proper :)
> > > >
> > > > Alright, how about this patch.
> > > >
> > > > Actually tested with some simple sysfs files.
> > > >
> > > > If people don't strongly object, I'll add "real" tests to it, hook it up
> > > > to all arches, write a manpage, and all the fun fluff a new syscall
> > > > deserves and submit it "for real".
> > >
> > > Just FYI, io_uring is moving towards the same kind of thing... IIRC
> > > you can already use it to batch a bunch of open() calls, then batch a
> > > bunch of read() calls on all the new fds and close them at the same
> > > time. And I think they're planning to add support for doing
> > > open()+read()+close() all in one go, too, except that it's a bit
> > > complicated because passing forward the file descriptor in a generic
> > > way is a bit complicated.
> >
> > It is complicated, I wouldn't recommend using io_ring for reading a
> > bunch of procfs or sysfs files, that feels like a ton of overkill with
> > too much setup/teardown to make it worth while.
> >
> > But maybe not, will have to watch and see how it goes.
>
> It really isn't, and I too thinks it makes more sense than having a
> system call just for the explicit purpose of open/read/close. As Jann
> said, you can't currently do a linked sequence of open/read/close,
> because the fd passing between them isn't done. But that will come in
> the future. If the use case is "a bunch of files", then you could
> trivially do "open bunch", "read bunch", "close bunch" in three separate
> steps.
>
> Curious what the use case is for this that warrants a special system
> call?
>

Agreed. I'd really rather see something more general-purpose than the
proposed readfile(). At least with NFS and SMB, you can compound
together fairly arbitrary sorts of operations, and it'd be nice to be
able to pattern calls into the kernel for those sorts of uses.

So, NFSv4 has the concept of a current_stateid that is maintained by the
server. So basically you can do all this (e.g.) in a single compound:

open <some filehandle get a stateid>
write <using that stateid>
close <same stateid>

It'd be nice to be able to do something similar with io_uring. Make it
so that when you do an open, you set the "current fd" inside the
kernel's context, and then be able to issue io_uring requests that
specify a magic "fd" value that use it.

That would be a really useful pattern.
--
Jeff Layton <jlayton@xxxxxxxxxx>