Re: [PATCH 18/52] virtio-fs: Map cache using the values from the capabilities

From: Stefan Hajnoczi
Date: Mon Dec 17 2018 - 09:56:47 EST


On Mon, Dec 17, 2018 at 11:53:46AM +0100, David Hildenbrand wrote:
> On 14.12.18 14:44, Stefan Hajnoczi wrote:
> > On Thu, Dec 13, 2018 at 01:38:23PM +0100, Cornelia Huck wrote:
> >> On Thu, 13 Dec 2018 13:24:31 +0100
> >> David Hildenbrand <david@xxxxxxxxxx> wrote:
> >>
> >>> On 13.12.18 13:15, Dr. David Alan Gilbert wrote:
> >>>> * David Hildenbrand (david@xxxxxxxxxx) wrote:
> >>>>> On 13.12.18 11:00, Dr. David Alan Gilbert wrote:
> >>>>>> * David Hildenbrand (david@xxxxxxxxxx) wrote:
> >>>>>>> On 13.12.18 10:13, Dr. David Alan Gilbert wrote:
> >>>>>>>> * David Hildenbrand (david@xxxxxxxxxx) wrote:
> >>>>>>>>> On 10.12.18 18:12, Vivek Goyal wrote:
> >>>>>>>>>> Instead of assuming we had the fixed bar for the cache, use the
> >>>>>>>>>> value from the capabilities.
> >>>>>>>>>>
> >>>>>>>>>> Signed-off-by: Dr. David Alan Gilbert <dgilbert@xxxxxxxxxx>
> >>>>>>>>>> ---
> >>>>>>>>>> fs/fuse/virtio_fs.c | 32 +++++++++++++++++---------------
> >>>>>>>>>> 1 file changed, 17 insertions(+), 15 deletions(-)
> >>>>>>>>>>
> >>>>>>>>>> diff --git a/fs/fuse/virtio_fs.c b/fs/fuse/virtio_fs.c
> >>>>>>>>>> index 60d496c16841..55bac1465536 100644
> >>>>>>>>>> --- a/fs/fuse/virtio_fs.c
> >>>>>>>>>> +++ b/fs/fuse/virtio_fs.c
> >>>>>>>>>> @@ -14,11 +14,6 @@
> >>>>>>>>>> #include <uapi/linux/virtio_pci.h>
> >>>>>>>>>> #include "fuse_i.h"
> >>>>>>>>>>
> >>>>>>>>>> -enum {
> >>>>>>>>>> - /* PCI BAR number of the virtio-fs DAX window */
> >>>>>>>>>> - VIRTIO_FS_WINDOW_BAR = 2,
> >>>>>>>>>> -};
> >>>>>>>>>> -
> >>>>>>>>>> /* List of virtio-fs device instances and a lock for the list */
> >>>>>>>>>> static DEFINE_MUTEX(virtio_fs_mutex);
> >>>>>>>>>> static LIST_HEAD(virtio_fs_instances);
> >>>>>>>>>> @@ -518,7 +513,7 @@ static int virtio_fs_setup_dax(struct virtio_device *vdev, struct virtio_fs *fs)
> >>>>>>>>>> struct dev_pagemap *pgmap;
> >>>>>>>>>> struct pci_dev *pci_dev;
> >>>>>>>>>> phys_addr_t phys_addr;
> >>>>>>>>>> - size_t len;
> >>>>>>>>>> + size_t bar_len;
> >>>>>>>>>> int ret;
> >>>>>>>>>> u8 have_cache, cache_bar;
> >>>>>>>>>> u64 cache_offset, cache_len;
> >>>>>>>>>> @@ -551,17 +546,13 @@ static int virtio_fs_setup_dax(struct virtio_device *vdev, struct virtio_fs *fs)
> >>>>>>>>>> }
> >>>>>>>>>>
> >>>>>>>>>> /* TODO handle case where device doesn't expose BAR? */
> >>>>>>>>>
> >>>>>>>>> For virtio-pmem we decided to not go via BARs as this would effectively
> >>>>>>>>> make it only usable for virtio-pci implementers. Instead, we are going
> >>>>>>>>> to export the applicable physical device region directly (e.g.
> >>>>>>>>> phys_start, phys_size in virtio config), so it is decoupled from PCI
> >>>>>>>>> details. Doing the same for virtio-fs would allow e.g. also virtio-ccw
> >>>>>>>>> to make eventually use of this.
> >>>>>>>>
> >>>>>>>> That makes it a very odd looking PCI device; I can see that with
> >>>>>>>> virtio-pmem it makes some sense, given that it's job is to expose
> >>>>>>>> arbitrary chunks of memory.
> >>>>>>>>
> >>>>>>>> Dave
> >>>>>>>
> >>>>>>> Well, the fact that your are
> >>>>>>>
> >>>>>>> - including <uapi/linux/virtio_pci.h>
> >>>>>>> - adding pci related code
> >>>>>>>
> >>>>>>> in/to fs/fuse/virtio_fs.c
> >>>>>>>
> >>>>>>> tells me that these properties might be better communicated on the
> >>>>>>> virtio layer, not on the PCI layer.
> >>>>>>>
> >>>>>>> Or do you really want to glue virtio-fs to virtio-pci for all eternity?
> >>>>>>
> >>>>>> No, these need cleaning up; and the split within the bar
> >>>>>> is probably going to change to be communicated via virtio layer
> >>>>>> rather than pci capabilities. However, I don't want to make our PCI
> >>>>>> device look odd, just to make portability to non-PCI devices - so it's
> >>>>>> right to make the split appropriately, but still to use PCI bars
> >>>>>> for what they were designed for.
> >>>>>>
> >>>>>> Dave
> >>>>>
> >>>>> Let's discuss after the cleanup. In general I am not convinced this is
> >>>>> the right thing to do. Using virtio-pci for anything else than pure
> >>>>> transport smells like bad design to me (well, I am no virtio expert
> >>>>> after all ;) ). No matter what PCI bars were designed for. If we can't
> >>>>> get the same running with e.g. virtio-ccw or virtio-whatever, it is
> >>>>> broken by design (or an addon that is tightly glued to virtio-pci, if
> >>>>> that is the general idea).
> >>>>
> >>>> I'm sure we can find alternatives for virtio-*, so I wouldn't expect
> >>>> it to be glued to virtio-pci.
> >>>>
> >>>> Dave
> >>>
> >>> As s390x does not have the concept of memory mapped io (RAM is RAM,
> >>> nothing else), this is not architectured. vitio-ccw can therefore not
> >>> define anything similar like that. However, in virtual environments we
> >>> can do whatever we want on top of the pure transport (e.g. on the virtio
> >>> layer).
> >>>
> >>> Conny can correct me if I am wrong.
> >>
> >> I don't think you're wrong, but I haven't read the code yet and I'm
> >> therefore not aware of the purpose of this BAR.
> >>
> >> Generally, if there is a memory location shared between host and guest,
> >> we need a way to communicate its location, which will likely differ
> >> between transports. For ccw, I could imagine a new channel command
> >> dedicated to exchanging configuration information (similar to what
> >> exists today to communicate the locations of virtqueues), but I'd
> >> rather not go down this path.
> >>
> >> Without reading the code/design further, can we use one of the
> >> following instead of a BAR:
> >> - a virtqueue;
> >> - something in config space?
> >> That would be implementable by any virtio transport.
> >
> > The way I think about this is that we wish to extend the VIRTIO device
> > model with the concept of shared memory. virtio-fs, virtio-gpu, and
> > virtio-vhost-user all have requirements for shared memory.
> >
> > This seems like a transport-level issue to me. PCI supports
> > memory-mapped I/O and that's the right place to do it. If you try to
> > put it into config space or the virtqueue, you'll end up with something
> > that cannot be realized as a PCI device because it bypasses PCI bus
> > address translation.
> >
> > If CCW needs a side-channel, that's fine. But that side-channel is a
> > CCW-specific mechanism and probably doesn't apply to all other
> > transports.
> >
> > Stefan
> >
>
> I think the problem is more fundamental. There is no iommu. Whatever
> shared region you want to indicate, you want it to be assigned a memory
> region in guest physical memory. Like a DIMM/NVDIMM. And this should be
> different to the concept of a BAR. Or am I missing something?

If you implement a physical virtio PCI adapter then there is bus
addressing and an IOMMU and VIRTIO has support for that. I'm not sure I
understand what you mean by "there is no iommu"?

> I am ok with using whatever other channel to transport such information.
> But I believe this is different to a typical BAR. (I wish I knew more
> about PCI internals ;) ).
>
> I would also like to know how shared memory works as of now for e.g.
> virtio-gpu.

virtio-gpu currently does not use shared memory, it needs it for future
features.

Stefan

Attachment: signature.asc
Description: PGP signature