Re: [PATCH 15/52] fuse: map virtio_fs DAX window BAR

From: Dr. David Alan Gilbert
Date: Fri Dec 14 2018 - 05:10:09 EST


* Vivek Goyal (vgoyal@xxxxxxxxxx) wrote:
> On Thu, Dec 13, 2018 at 03:40:52PM -0500, Vivek Goyal wrote:
> > On Thu, Dec 13, 2018 at 12:15:51PM -0800, Dan Williams wrote:
> > > On Thu, Dec 13, 2018 at 12:09 PM Dr. David Alan Gilbert
> > > <dgilbert@xxxxxxxxxx> wrote:
> > > >
> > > > * Dan Williams (dan.j.williams@xxxxxxxxx) wrote:
> > > > > On Mon, Dec 10, 2018 at 9:22 AM Vivek Goyal <vgoyal@xxxxxxxxxx> wrote:
> > > > > >
> > > > > > From: Stefan Hajnoczi <stefanha@xxxxxxxxxx>
> > > > > >
> > > > > > Experimental QEMU code introduces an MMIO BAR for mapping portions of
> > > > > > files in the virtio-fs device. Map this BAR so that FUSE DAX can access
> > > > > > file contents from the host page cache.
> > > > >
> > > > > FUSE DAX sounds terrifying, can you explain a bit more about what this is?
> > > >
> > > > We've got a guest running in QEMU; it sees an emulated PCI device
> > > > that runs the FUSE protocol over virtio. There is also a trick: via
> > > > commands sent over the device's virtio queue, (fragments of) host
> > > > files get mmap'd into the QEMU virtual memory that corresponds to the
> > > > KVM memory slot exposed to the guest for that BAR.
> > > >
> > > > The guest sees those chunks in that BAR, and thus you can read/write
> > > > to the host file by directly writing into that BAR.
> > >
> > > Ok so it's all software emulated and there won't be hardware DMA
> > > initiated by the guest to that address?
> >
> > That's my understanding.
> >
> > > I.e. if the host file gets
> > > truncated / hole-punched the guest would just cause a refault and the
> > > filesystem could fill in the block,
> >
> > Right
> >
> > > or the guest is expected to die if
> > > the fault to the truncated file range results in SIGBUS.
> >
> > Are you referring to the case where a file page is mapped in qemu,
> > another guest/process truncates that page, and qemu then gets SIGBUS
> > when it tries to access it? Have not tried it, will give it a try. Not
> > sure what happens when QEMU receives SIGBUS.
> >
> > Having said that, this is not different from the case of one process
> > mapping a file and another process truncating the file and first process
> > getting SIGBUS, right?
>
> OK, tried this and the guest process hangs.
>
> Stefan, dgilbert, this reminds me that we have faced this issue during
> our testing and we decided that this will need some fixing in KVM. I
> even put this in as part of changelog of patch with subject "fuse: Take
> inode lock for dax inode truncation"
>
> "Another problem is, if we setup a mapping in fuse_iomap_begin(), and
> file gets truncated and dax read/write happens, KVM currently hangs.
> It tries to fault in a page which does not exist on host (file got
> truncated). It probably requires fixing in KVM."
>
> Not sure what should happen though when qemu receives SIGBUS in this
> case.

Yes, and I noted it in the TODO in my qemu patch posting.

We need to figure out what we want the guest to see in this case and
figure out how to make QEMU/KVM fix it up so that the guest doesn't
see anything odd.
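
For reference, the plain-process case mentioned above (one process maps a
file, the file gets truncated, the next access to the mapping gets SIGBUS)
can be reproduced in userspace without QEMU or KVM. The following is only a
rough sketch under some assumptions: it is Python on a Linux-like host, the
function name access_after_truncate is made up for illustration, and the
child truncates the file itself as a stand-in for "another process". It
says nothing about how KVM behaves when the faulting access comes from the
guest.

```python
import mmap
import os
import signal
import tempfile


def access_after_truncate() -> int:
    """Return the number of the signal that kills a child which touches a
    shared file mapping after the file was truncated (0 if it was not killed).
    """
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"x" * mmap.PAGESIZE)
        pid = os.fork()
        if pid == 0:
            # Child: plays the role of the process holding the mapping.
            try:
                m = mmap.mmap(fd, mmap.PAGESIZE, access=mmap.ACCESS_READ)
                _ = m[0]             # fault the page in while the file is intact
                os.ftruncate(fd, 0)  # stand-in for a truncation by someone else
                _ = m[0]             # page now lies beyond EOF: expect SIGBUS
            except BaseException:
                os._exit(1)          # unexpected Python-level failure
            os._exit(0)              # reached only if no signal was delivered
        _, status = os.waitpid(pid, 0)
        return os.WTERMSIG(status) if os.WIFSIGNALED(status) else 0
    finally:
        os.close(fd)
        os.unlink(path)


if __name__ == "__main__":
    sig = access_after_truncate()
    print("child killed by signal", sig, "(SIGBUS is", int(signal.SIGBUS), ")")
```

The access is done in a forked child so that the signal kills the child
rather than the caller. The open question in this thread is the analogous
situation where the mapping lives in QEMU and the access comes from the
guest through KVM.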

Dave

> Thanks
> Vivek
--
Dr. David Alan Gilbert / dgilbert@xxxxxxxxxx / Manchester, UK