Re: [RFC PATCH 00/20] Introduce the famfs shared-memory file system

From: Amir Goldstein
Date: Sun May 19 2024 - 01:59:40 EST


On Fri, May 17, 2024 at 12:55 PM Miklos Szeredi <miklos@xxxxxxxxxx> wrote:
>
> On Thu, 29 Feb 2024 at 07:52, Amir Goldstein <amir73il@xxxxxxxxx> wrote:
>
> > I'm not a virtiofs expert, but I don't think that you are wrong about this.
> > IIUC, virtiofsd could map an arbitrary memory region to any fuse file mmapped
> > by a virtiofs client.
> >
> > So what are the gaps between virtiofs and famfs that justify a new filesystem
> > driver and new userspace API?
>
> Let me try to fill in some gaps. I've looked at the famfs driver
> (even tried to set it up in a VM, but got stuck with the EFI stuff).
>
> - famfs has an extent list per file that indicates how each page
> within the file should be mapped onto the dax device, IOW it has the
> following mapping:
>
> [famfs file, offset] -> [offset, length]
>
> - fuse can currently map a fuse file onto a backing file:
>
> [fuse file] -> [backing file]
>
> The interface for the latter is
>
> backing_id = ioctl(dev_fuse_fd, FUSE_DEV_IOC_BACKING_OPEN, backing_map);
> ...
> fuse_open_out.flags |= FOPEN_PASSTHROUGH;
> fuse_open_out.backing_id = backing_id;

FYI, library and example code was recently merged to libfuse:
https://github.com/libfuse/libfuse/pull/919
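
To make the per-file extent list above concrete, here is a minimal
user-space sketch of the lookup it implies. It is not famfs code: the
struct extent layout and extent_lookup() are names I made up for
illustration, and I am assuming each extent records where a run of file
bytes lives on the dax device.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical extent: a run of file bytes backed by a run of device bytes. */
struct extent {
    uint64_t file_offset; /* start of the run within the file */
    uint64_t dev_offset;  /* start of the run within the dax device */
    uint64_t len;         /* length of the run in bytes */
};

/*
 * Translate a file offset into a device offset.
 * On success returns 0, stores the device offset in *dev_off and the
 * number of bytes left in the matching extent in *remaining.
 * Returns -1 if no extent covers file_off (a hole or past the end).
 */
static int extent_lookup(const struct extent *ext, size_t n,
                         uint64_t file_off,
                         uint64_t *dev_off, uint64_t *remaining)
{
    for (size_t i = 0; i < n; i++) {
        if (file_off >= ext[i].file_offset &&
            file_off - ext[i].file_offset < ext[i].len) {
            uint64_t delta = file_off - ext[i].file_offset;

            *dev_off = ext[i].dev_offset + delta;
            *remaining = ext[i].len - delta;
            return 0;
        }
    }
    return -1;
}
```

A linear scan is enough for a sketch; a real implementation would
presumably keep the list sorted and search it, or cache the last hit.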

>
> This looks suitable for doing the famfs file -> dax device mapping as
> well. I wouldn't extend the ioctl with extent information, since
> famfs can just use FUSE_DEV_IOC_BACKING_OPEN once to register the dax
> device. The flags field could be used to tell the kernel to treat
> this fd as a dax device instead of a regular file.
>
> Later, when the file is opened the extent list could be sent in the
> open reply together with the backing id. The fuse_ext_header
> mechanism seems suitable for this.
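
A rough sketch of how an extent list might be packed behind such a
header, purely to illustrate the idea. The header below is modelled on
struct fuse_ext_header (a size followed by a type), but
EXT_TYPE_DAX_EXTENTS, struct dax_extent and pack_extents() are
hypothetical; nothing like them exists in the fuse protocol today, and
the real wire format would have to be agreed on.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Modelled on the kernel's struct fuse_ext_header: size, then type. */
struct ext_header {
    uint32_t size; /* total size of this extension, header included */
    uint32_t type; /* extension type */
};

/* Hypothetical type value, chosen only for this sketch. */
#define EXT_TYPE_DAX_EXTENTS 1000u

/* Hypothetical payload entry: one contiguous range of the dax device. */
struct dax_extent {
    uint64_t dev_offset;
    uint64_t len;
};

/*
 * Pack n extents behind a header into buf.
 * Returns the number of bytes written, or 0 if buf is too small.
 */
static size_t pack_extents(void *buf, size_t buflen,
                           const struct dax_extent *ext, uint32_t n)
{
    size_t payload = (size_t)n * sizeof(*ext);
    size_t total = sizeof(struct ext_header) + payload;
    struct ext_header hdr;

    if (total > buflen || total > UINT32_MAX)
        return 0;

    hdr.size = (uint32_t)total;
    hdr.type = EXT_TYPE_DAX_EXTENTS;
    memcpy(buf, &hdr, sizeof(hdr));
    if (n > 0)
        memcpy((char *)buf + sizeof(hdr), ext, payload);
    return total;
}
```
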
>
> And I think that's it as far as APIs are concerned.
>
> Note: this is already more generic than the current famfs prototype,
> since multiple dax devices could be used as backing for famfs files,
> with the constraint that a single file can only map data from a single
> dax device.
>
> As for implementing dax passthrough, I think that needs a separate
> source file, the one used by virtiofs (fs/fuse/dax.c) does not appear
> to have many commonalities with this one. That could be renamed to
> virtiofs_dax.c as it's pretty much virtiofs specific, AFAICT.
>
> Comments?

Would probably also need to decouple CONFIG_FUSE_DAX
from CONFIG_FUSE_VIRTIO_DAX.

What about fc->dax_mode (i.e. dax= mount option)?

What about FUSE_IS_DAX()? Does it apply to both dax implementations?

Sounds like a decent plan.
John, let us know if you need help understanding the details.

> Am I missing something significant?

Would we need to set IS_DAX() on inode init time or can we set it
later on first file open?

Currently, iomodes enforces that the opens of a fuse inode are either
all mapped to the same backing file or none are mapped to a backing file:

fuse_inode_uncached_io_start()
{
..
/* deny conflicting backing files on same fuse inode */

The iomodes rules will need to be amended to verify that:
- Every open of an IS_DAX() inode is mapped to a backing dax device
- All open files of the same fuse inode are mapped to the same range
of the backing file/dax device.
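
As a thought experiment, the two rules could be modelled in user space
roughly like this. This is not kernel code: struct model_inode and
model_open() are invented for the sketch, a single [dev_offset, len]
range stands in for the whole extent list, and the errno values are my
assumption of what would be reasonable.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-inode iomode state. */
struct model_inode {
    bool is_dax;         /* inode was marked IS_DAX() */
    int open_count;      /* number of currently open files */
    int backing_id;      /* backing id shared by all opens, 0 if none */
    uint64_t dev_offset; /* start of the mapped device range */
    uint64_t len;        /* length of the mapped device range */
};

/*
 * Model one open. backing_id == 0 means "not mapped to a backing
 * file/dax device". Returns 0 on success or a negative errno.
 */
static int model_open(struct model_inode *inode, int backing_id,
                      uint64_t dev_offset, uint64_t len)
{
    /* Rule 1: an IS_DAX() inode must always be opened with a mapping. */
    if (inode->is_dax && backing_id == 0)
        return -EIO;

    if (inode->open_count > 0) {
        /* Existing rule: deny conflicting backing files on same inode. */
        if (inode->backing_id != backing_id)
            return -EBUSY;
        /* Rule 2: all opens must map the same range of the backing. */
        if (backing_id != 0 &&
            (inode->dev_offset != dev_offset || inode->len != len))
            return -EBUSY;
    } else {
        /* First open decides the mapping for the inode. */
        inode->backing_id = backing_id;
        inode->dev_offset = dev_offset;
        inode->len = len;
    }

    inode->open_count++;
    return 0;
}
```

The first open fixes the mapping and later opens are only checked
against it, which mirrors how the existing backing file check behaves
as far as I understand it.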

Thanks,
Amir.