Re: [PATCH 00/52] [RFC] virtio-fs: shared file system for virtual machines
From: Konrad Rzeszutek Wilk
Date: Wed Dec 12 2018 - 15:30:58 EST
On Mon, Dec 10, 2018 at 12:12:26PM -0500, Vivek Goyal wrote:
> Hi,
>
> Here are RFC patches for virtio-fs. Looking for feedback on this approach.
>
> These patches should apply on top of 4.20-rc5. We have also put code for
> various components here.
>
> https://gitlab.com/virtio-fs
>
> Problem Description
> ===================
> We want to be able to take a directory tree on the host and share it with
> guest[s]. Our goal is to be able to do it in a fast, consistent and secure
> manner. Our primary use case is kata containers, but it should be usable in
> other scenarios as well.
>
> Containers may rely on local file system semantics for shared volumes,
> read-write mounts that multiple containers access simultaneously. File
> system changes must be visible to other containers with the same consistency
> expected of a local file system, including mmap MAP_SHARED.
>
> Existing Solutions
> ==================
> We looked at existing solutions. virtio-9p already provides basic shared
> file system functionality, but it does not offer local file system
> semantics, causing some workloads and test suites to fail. In addition,
> virtio-9p performance has been an issue for Kata Containers and we believe
> this cannot be alleviated without major changes that do not fit into the
> 9P protocol.
>
> Design Overview
> ===============
> With the goal of designing something with better performance and local file
> system semantics, a bunch of ideas were proposed.
>
> - Use fuse protocol (instead of 9p) for communication between guest
> and host. Guest kernel will be fuse client and a fuse server will
> run on host to serve the requests. Benchmark results (see below) are
> encouraging and show this approach performs well (2x to 8x improvement
> depending on test being run).
>
> - For data access inside guest, mmap portion of file in QEMU address
> space and guest accesses this memory using dax. That way guest page
> cache is bypassed and there is only one copy of data (on host). This
> will also enable mmap(MAP_SHARED) between guests.
>
> - For metadata coherency, there is a shared memory region which contains
> version number associated with metadata and any guest changing metadata
> updates version number and other guests refresh metadata on next
> access. This is still experimental and implementation is not complete.
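
A minimal C sketch of the version-number scheme described in the last
bullet, under stated assumptions. All names here (struct version_table,
NSLOTS, meta_changed(), meta_snapshot(), meta_fill(), meta_needs_refresh())
are hypothetical and not taken from the patches, and an ordinary struct
stands in for the shared memory region. The actual layout and protocol
may well differ, since the bullet says the implementation is incomplete.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define NSLOTS 64   /* hypothetical number of version slots */

/* Stand-in for the shared region: one version counter per inode slot. */
struct version_table {
    _Atomic uint64_t version[NSLOTS];
};

/* Per-guest cached metadata for one inode (only a size field here). */
struct cached_meta {
    uint64_t seen_version;  /* version sampled before the attributes were fetched */
    uint64_t size;          /* cached attribute */
    bool     valid;         /* false until the cache has been filled once */
};

/* Writer side: a guest that changed metadata bumps the slot's version. */
static void meta_changed(struct version_table *t, unsigned slot)
{
    assert(slot < NSLOTS);
    atomic_fetch_add_explicit(&t->version[slot], 1, memory_order_release);
}

/*
 * Reader side, step 1: sample the version BEFORE asking the server for
 * attributes, so a concurrent change leaves the cache looking stale
 * rather than wrongly fresh.
 */
static uint64_t meta_snapshot(struct version_table *t, unsigned slot)
{
    assert(slot < NSLOTS);
    return atomic_load_explicit(&t->version[slot], memory_order_acquire);
}

/* Reader side, step 2: store the fetched attributes with that version. */
static void meta_fill(struct cached_meta *c, uint64_t version, uint64_t size)
{
    c->seen_version = version;
    c->size = size;
    c->valid = true;
}

/* On next access: refresh if never filled or if the version has moved. */
static bool meta_needs_refresh(struct version_table *t, unsigned slot,
                               const struct cached_meta *c)
{
    return !c->valid || c->seen_version != meta_snapshot(t, slot);
}
```

In this sketch a guest would call meta_needs_refresh() on each metadata
access and go back to the server only when it returns true. A guest that
modifies metadata calls meta_changed() afterwards. How inodes map to slots,
and what happens on slot collisions, is left open here.
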
What about Windows or BSD guests? Is there a plan to make this work with
them as well?

What about the Virtio spec? Are there plans to make changes there as well?