Re: [PATCH RESEND V12 4/8] fuse: Passthrough initialization and release
From: Alessio Balsini
Date: Mon May 17 2021 - 07:36:55 EST
On Wed, May 05, 2021 at 03:21:26PM +0300, Amir Goldstein wrote:
> On Wed, Feb 17, 2021 at 3:52 PM Miklos Szeredi <miklos@xxxxxxxxxx> wrote:
> >
> > On Mon, Jan 25, 2021 at 4:31 PM Alessio Balsini <balsini@xxxxxxxxxxx> wrote:
> > >
> > > Implement the FUSE passthrough ioctl that associates the lower
> > > (passthrough) file system file with the fuse_file.
> > >
> > > The file descriptor passed to the ioctl by the FUSE daemon is used to
> > > access the relative file pointer, that will be copied to the fuse_file
> > > data structure to consolidate the link between the FUSE and lower file
> > > system.
> > >
> > > To enable the passthrough mode, user space triggers the
> > > FUSE_DEV_IOC_PASSTHROUGH_OPEN ioctl and, if the call succeeds, receives
> > > back an identifier that will be used at open/create response time in the
> > > fuse_open_out field to associate the FUSE file to the lower file system
> > > file.
> > > The value returned by the ioctl to user space can be:
> > > - > 0: success, the identifier can be used as part of an open/create
> > > reply.
> > > - <= 0: an error occurred.
> > > The value 0 represents an error to preserve backward compatibility: the
> > > fuse_open_out field that is used to pass the passthrough_fh back to the
> > > kernel uses the same bits that were previously as struct padding, and is
> > > commonly zero-initialized (e.g., in the libfuse implementation).
> > > Removing 0 from the correct values fixes the ambiguity between the case
> > > in which 0 corresponds to a real passthrough_fh, a missing
> > > implementation of FUSE passthrough or a request for a normal FUSE file,
> > > simplifying the user space implementation.
> > >
> > > For the passthrough mode to be successfully activated, the lower file
> > > system file must implement both read_iter and write_iter file
> > > operations. This extra check avoids special pseudo files to be targeted
> > > for this feature.
> > > Passthrough comes with another limitation: no further file system
> > > stacking is allowed for those FUSE file systems using passthrough.
> > >
> > > Signed-off-by: Alessio Balsini <balsini@xxxxxxxxxxx>
> > > ---
> > > fs/fuse/inode.c | 5 +++
> > > fs/fuse/passthrough.c | 87 ++++++++++++++++++++++++++++++++++++++++++-
> > > 2 files changed, 90 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c
> > > index a1104d5abb70..7ebc398fbacb 100644
> > > --- a/fs/fuse/inode.c
> > > +++ b/fs/fuse/inode.c
> > > @@ -1133,6 +1133,11 @@ EXPORT_SYMBOL_GPL(fuse_send_init);
> > >
> > > static int free_fuse_passthrough(int id, void *p, void *data)
> > > {
> > > + struct fuse_passthrough *passthrough = (struct fuse_passthrough *)p;
> > > +
> > > + fuse_passthrough_release(passthrough);
> > > + kfree(p);
> > > +
> > > return 0;
> > > }
> > >
> > > diff --git a/fs/fuse/passthrough.c b/fs/fuse/passthrough.c
> > > index 594060c654f8..cf993e83803e 100644
> > > --- a/fs/fuse/passthrough.c
> > > +++ b/fs/fuse/passthrough.c
> > > @@ -3,19 +3,102 @@
> > > #include "fuse_i.h"
> > >
> > > #include <linux/fuse.h>
> > > +#include <linux/idr.h>
> > >
> > > int fuse_passthrough_open(struct fuse_dev *fud,
> > > struct fuse_passthrough_out *pto)
> > > {
> > > - return -EINVAL;
> > > + int res;
> > > + struct file *passthrough_filp;
> > > + struct fuse_conn *fc = fud->fc;
> > > + struct inode *passthrough_inode;
> > > + struct super_block *passthrough_sb;
> > > + struct fuse_passthrough *passthrough;
> > > +
> > > + if (!fc->passthrough)
> > > + return -EPERM;
> > > +
> > > + /* This field is reserved for future implementation */
> > > + if (pto->len != 0)
> > > + return -EINVAL;
> > > +
> > > + passthrough_filp = fget(pto->fd);
> > > + if (!passthrough_filp) {
> > > + pr_err("FUSE: invalid file descriptor for passthrough.\n");
> > > + return -EBADF;
> > > + }
> > > +
> > > + if (!passthrough_filp->f_op->read_iter ||
> > > + !passthrough_filp->f_op->write_iter) {
> > > + pr_err("FUSE: passthrough file misses file operations.\n");
> > > + res = -EBADF;
> > > + goto err_free_file;
> > > + }
> > > +
> > > + passthrough_inode = file_inode(passthrough_filp);
> > > + passthrough_sb = passthrough_inode->i_sb;
> > > + if (passthrough_sb->s_stack_depth >= FILESYSTEM_MAX_STACK_DEPTH) {
> > > + pr_err("FUSE: fs stacking depth exceeded for passthrough\n");
> >
> > No need to print an error to the logs, this can be a perfectly normal
> > occurrence. However I'd try to find a more unique error value than
> > EINVAL so that the fuse server can interpret this as "not your fault,
> > but can't support passthrough on this file". E.g. EOPNOTSUPP.
> >
> >
>
> Sorry for the fashionably late response...
> Same comment for !{read,write}_iter case above.
> EBAFD is really not appropriate there.
> May I suggest ELOOP for s_stack_depth and EOPNOTSUPP
> for no rw iter ops.
>
> Are you planning to post another version of the patches soon?
>
> Thanks,
> Amir.
Hi Amir,
Thanks for your feedback. I like the ELOOP for the stack_depth, and the
EOPNOTSUPP is already in my local branch.
I've been busy with the integration of this last patchset in Android,
that unfortunately will be shipped as out-of-tree. I'm keeping all my
fingers crossed for my workarounds to prevent future passthrough UAPI
breakages to work. :)
I have an ugly patch which uses IDR as Miklos asked, but unfortunately
I'm facing some performance issues due to the locking mechanisms to keep
guarantee the RCU consistency. I can post the new patch set as an RFC
soon for the community to take a look.
At a glance what happens is:
- the IDR, one for each fuse_conn, contains pointers to "struct
fuse_passthrough" containing:
- fuse_file *: which is using passthrough,
- file *: native file system file target,
- cred of the FUSE server,
- ioctl(PASSTHROUGH_OPEN): updates IDR, requires spinlock:
- kmalloc(fuse_passthrough), update file and cred,
- ID = idr_alloc(),
- return ID,
- fuse_open reply from FUSE server with passthrough ID: updates IDR,
requires spinlock:
- pt = idr_find(ID),
- update fuse_file with the current fuse_file,
- update fuse_file->passthrough_id = ID,
- idr_replace(),
- read/write/mmap: lock-free IDR read:
- idr_find(fuse_file::passthrough_id),
- forward request to lower file system as for the current FUSE
passthrough patches.
- ioctl(PASSTHROUGH_CLOSE): updates IDR, requires spinlock:
- idr_remove();
- call_rcu(): tell possible open fuse_file user that the ID is no more
valid and kfree() the allocated struct;
- close(fuse_file): updates IDR, requires spinlock:
- ID = fuse_file::passthrough_id
- idr_find(ID),
- fuse_passthrough::fuse_file = NULL,
- idr_replace().
This would be fine if passthrough is activated for a few, big files,
where the passthrough overhead is dominated by the direct access
benefits, but if passthrough is enabled for many small files which just
do one or two read/writes (as what I did for my benchmark of total build
time for the kernel, where I was passing-through every opened file), the
overhead becomes a real issue.
If you have any thoughts on how to make this simpler, I'm absolutely
open to fix this.
We are also in the preliminary design step of having a hierarchical
passthrough feature, for which all the dir/file operations performed in
any of the contents of the passthrough folder, are automatically
forwarded to the lower file system.
That may fix this overhead issue, but it's still early to forecast what
is going to happen with this idea. Right now I'm not even sure it's
feasible.
Looking forward to receiving some feedback, thanks in advance!
Alessio