Re: [PATCH 00/52] [RFC] virtio-fs: shared file system for virtual machines

From: Vivek Goyal
Date: Wed Dec 12 2018 - 16:22:43 EST


On Wed, Dec 12, 2018 at 03:30:49PM -0500, Konrad Rzeszutek Wilk wrote:
> On Mon, Dec 10, 2018 at 12:12:26PM -0500, Vivek Goyal wrote:
> > Hi,
> >
> > Here are RFC patches for virtio-fs. Looking for feedback on this approach.
> >
> > These patches should apply on top of 4.20-rc5. We have also put code for
> > various components here.
> >
> > https://gitlab.com/virtio-fs
> >
> > Problem Description
> > ===================
> > We want to be able to take a directory tree on the host and share it with
> > guest[s]. Our goal is to be able to do it in a fast, consistent and secure
> > manner. Our primary use case is Kata Containers, but it should be usable in
> > other scenarios as well.
> >
> > Containers may rely on local file system semantics for shared volumes,
> > read-write mounts that multiple containers access simultaneously. File
> > system changes must be visible to other containers with the same consistency
> > expected of a local file system, including mmap MAP_SHARED.
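
For readers less familiar with the MAP_SHARED semantics mentioned above, here
is a minimal sketch of the behaviour a local file system gives. It is not part
of the patches. Two mappings in one process stand in for two containers, and
the helper name shared_mapping_demo is made up for illustration. It assumes a
Linux/Unix host where Python's mmap module accepts the flags argument.

```python
import mmap
import os
import tempfile


def shared_mapping_demo() -> bytes:
    """Write through one MAP_SHARED mapping and read through another.

    Hypothetical helper: two mappings of the same file stand in for two
    containers sharing a volume on a local file system.
    """
    fd, path = tempfile.mkstemp()
    try:
        # A mapping needs a non-empty file, so size it to one page.
        os.ftruncate(fd, mmap.PAGESIZE)
        writer = mmap.mmap(fd, mmap.PAGESIZE, flags=mmap.MAP_SHARED)
        reader = mmap.mmap(fd, mmap.PAGESIZE, flags=mmap.MAP_SHARED)
        try:
            writer[0:5] = b"hello"     # store via the first mapping
            return bytes(reader[0:5])  # load via the second mapping
        finally:
            writer.close()
            reader.close()
    finally:
        os.close(fd)
        os.unlink(path)


if __name__ == "__main__":
    print(shared_mapping_demo())
```

The store becomes visible through the second mapping without any msync() or
read() call, which is the kind of consistency shared volumes may depend on.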
> >
> > Existing Solutions
> > ==================
> > We looked at existing solutions. virtio-9p already provides basic shared
> > file system functionality, but it does not offer local file system
> > semantics, causing some workloads and test suites to fail. In addition,
> > virtio-9p performance has been an issue for Kata Containers, and we
> > believe this cannot be alleviated without major changes that do not fit
> > into the 9P protocol.
> >
> > Design Overview
> > ===============
> > With the goal of designing something with better performance and local file
> > system semantics, a bunch of ideas were proposed.
> >
> > - Use the FUSE protocol (instead of 9P) for communication between guest
> >   and host. The guest kernel will be the FUSE client and a FUSE server
> >   will run on the host to serve the requests. Benchmark results (see
> >   below) are encouraging and show this approach performs well (2x to 8x
> >   improvement depending on the test being run).
> >
> > - For data access inside the guest, mmap a portion of the file in the
> >   QEMU address space; the guest accesses this memory using DAX. That way
> >   the guest page cache is bypassed and there is only one copy of the data
> >   (on the host). This will also enable mmap(MAP_SHARED) between guests.
> >
> > - For metadata coherency, there is a shared memory region which contains
> >   a version number associated with the metadata. Any guest changing
> >   metadata updates the version number, and other guests refresh their
> >   metadata on next access. This is still experimental and the
> >   implementation is not complete.
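
To make the metadata coherency idea above concrete, here is a rough
user-space simulation of the version-number scheme. It is only a sketch and
not the actual implementation, which is experimental and incomplete. The
class names (SharedVersionTable, Host, Guest), the per-inode granularity of
the counter, and the plain dictionary standing in for the shared memory
region are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class SharedVersionTable:
    """Stand-in for the shared memory region (assumed: one counter per inode)."""
    versions: dict = field(default_factory=dict)

    def read(self, ino: int) -> int:
        return self.versions.get(ino, 0)

    def bump(self, ino: int) -> None:
        self.versions[ino] = self.read(ino) + 1


class Host:
    """Authoritative metadata, as the host-side server would hold it."""
    def __init__(self) -> None:
        self.attrs: dict = {}


class Guest:
    """A guest that caches metadata and revalidates it by version number."""
    def __init__(self, host: Host, table: SharedVersionTable) -> None:
        self.host = host
        self.table = table
        self.cache: dict = {}   # ino -> (version seen, cached attrs)
        self.refreshes = 0      # how many times we had to ask the host

    def setattr(self, ino: int, **changes) -> None:
        # Change metadata on the host, then publish a new version number.
        self.host.attrs.setdefault(ino, {}).update(changes)
        self.table.bump(ino)

    def getattr(self, ino: int) -> dict:
        # Compare the shared version against the cached one; refresh from
        # the host only when they differ or nothing is cached yet.
        current = self.table.read(ino)
        cached = self.cache.get(ino)
        if cached is None or cached[0] != current:
            self.refreshes += 1
            self.cache[ino] = (current, dict(self.host.attrs.get(ino, {})))
        return self.cache[ino][1]


if __name__ == "__main__":
    host, table = Host(), SharedVersionTable()
    a, b = Guest(host, table), Guest(host, table)
    a.setattr(1, size=10)
    print(b.getattr(1), b.refreshes)
```

The point of the scheme is that an unchanged version number lets a guest
trust its cached metadata without a round trip to the host. The sketch
ignores races between reading the counter and fetching the attributes, which
a real implementation would have to handle.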
>
> What about Windows guests or BSD ones? Is there a plan to make that work with them as well?

Hi Konrad,

I have not thought much about making it work on Windows or BSD yet.
Does FUSE work on Windows? I am assuming it does on BSD. As long as FUSE
works, I am assuming that at least the basic mode can be made to work.

>
> What about the Virtio spec? Plans to make changes there as well?

Yes, there are plans to update the spec. Stefan posted a proposal here:

https://lists.oasis-open.org/archives/virtio-dev/201812/msg00073.html

Thanks
Vivek