Re: [RFC v4 3/3] vhost: introduce mdev based hardware backend

From: Tiwei Bie
Date: Fri Sep 20 2019 - 00:24:29 EST


On Tue, Sep 17, 2019 at 03:26:30PM +0800, Jason Wang wrote:
> On 2019/9/17 9:02, Tiwei Bie wrote:
> > diff --git a/drivers/vhost/mdev.c b/drivers/vhost/mdev.c
> > new file mode 100644
> > index 000000000000..8c6597aff45e
> > --- /dev/null
> > +++ b/drivers/vhost/mdev.c
> > @@ -0,0 +1,462 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/*
> > + * Copyright (C) 2018-2019 Intel Corporation.
> > + */
> > +
> > +#include <linux/compat.h>
> > +#include <linux/kernel.h>
> > +#include <linux/miscdevice.h>
> > +#include <linux/mdev.h>
> > +#include <linux/module.h>
> > +#include <linux/vfio.h>
> > +#include <linux/vhost.h>
> > +#include <linux/virtio_mdev.h>
> > +
> > +#include "vhost.h"
> > +
> > +struct vhost_mdev {
> > +	struct mutex mutex;
> > +	struct vhost_dev dev;
> > +	struct vhost_virtqueue *vqs;
> > +	int nvqs;
> > +	u64 state;
> > +	u64 features;
> > +	u64 acked_features;
> > +	struct vfio_group *vfio_group;
> > +	struct vfio_device *vfio_device;
> > +	struct mdev_device *mdev;
> > +};
> > +
> > +/*
> > + * XXX
> > + * We assume virtio_mdev.ko exposes below symbols for now, as we
> > + * don't have a proper way to access parent ops directly yet.
> > + *
> > + * virtio_mdev_readl()
> > + * virtio_mdev_writel()
> > + */
> > +extern u32 virtio_mdev_readl(struct mdev_device *mdev, loff_t off);
> > +extern void virtio_mdev_writel(struct mdev_device *mdev, loff_t off, u32 val);
>
>
> Need to consider a better approach, I feel we should do it through some kind
> of mdev driver instead of talking to the mdev device directly.

Yeah, a better approach is really needed here.
Besides, we may want a way to allow accessing the mdev
device_ops proposed in the series below from outside the
drivers/vfio/mdev/ directory.

https://lkml.org/lkml/2019/9/12/151

I.e. allow putting mdev drivers outside the above directory.
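
To make the idea concrete, here is a rough userspace sketch of a
backend that reaches the device only through a per-device ops table
instead of exported symbols. Everything in it (struct demo_mdev,
struct demo_mdev_ops, the demo_* functions, the register array) is
made up for illustration. It is not the device_ops API from the
series above and not kernel code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_NREGS 16

struct demo_mdev;

/* Hypothetical per-device ops table supplied by the parent driver. */
struct demo_mdev_ops {
	uint32_t (*readl)(struct demo_mdev *mdev, uint64_t off);
	void (*writel)(struct demo_mdev *mdev, uint64_t off, uint32_t val);
};

struct demo_mdev {
	const struct demo_mdev_ops *ops;
	uint32_t regs[DEMO_NREGS];	/* stand-in for device registers */
};

/* Parent side: off is the byte offset of a 32-bit register. */
static uint32_t demo_parent_readl(struct demo_mdev *mdev, uint64_t off)
{
	return (off / 4 < DEMO_NREGS) ? mdev->regs[off / 4] : 0;
}

static void demo_parent_writel(struct demo_mdev *mdev, uint64_t off,
			       uint32_t val)
{
	if (off / 4 < DEMO_NREGS)
		mdev->regs[off / 4] = val;
}

static const struct demo_mdev_ops demo_parent_ops = {
	.readl = demo_parent_readl,
	.writel = demo_parent_writel,
};

/*
 * Backend side: writes a queue size and reads it back, touching the
 * device only via mdev->ops. Returns 0 on success, -1 otherwise.
 */
static int demo_backend_set_queue_num(struct demo_mdev *mdev, uint64_t off,
				      uint32_t num)
{
	assert(mdev != NULL);

	if (!mdev->ops || !mdev->ops->readl || !mdev->ops->writel)
		return -1;

	mdev->ops->writel(mdev, off, num);
	return mdev->ops->readl(mdev, off) == num ? 0 : -1;
}
```

With something along these lines the backend would not need to link
against symbols exported by another module. It only needs the ops
pointer that comes with the device.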


> > +
> > +	for (queue_id = 0; queue_id < m->nvqs; queue_id++) {
> > +		vq = &m->vqs[queue_id];
> > +
> > +		if (!vq->desc || !vq->avail || !vq->used)
> > +			break;
> > +
> > +		virtio_mdev_writel(mdev, VIRTIO_MDEV_QUEUE_NUM, vq->num);
> > +
> > +		if (!vhost_translate_ring_addr(vq, (u64)vq->desc,
> > +					       vhost_get_desc_size(vq, vq->num),
> > +					       &addr))
> > +			return -EINVAL;
>
>
> Interesting, any reason for doing such kinds of translation to HVA? I
> believe the addr should already be an IOVA that has been mapped by VFIO.

Currently, in the software-based vhost-kernel and vhost-user
backends, QEMU passes ring addresses as HVAs in the SET_VRING_ADDR
ioctl when the IOTLB isn't enabled. If it's OK to let QEMU pass GPAs
to vhost-mdev in this case, then this translation won't be needed.
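
For reference, a minimal userspace sketch of the kind of lookup
involved, assuming the ring address arrives as an HVA and has to be
mapped to a GPA through a table of memory regions. The names
(struct mem_region, hva_to_gpa()) are hypothetical, and this is not
the vhost_translate_ring_addr() from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical region descriptor: one contiguous guest memory range. */
struct mem_region {
	uint64_t gpa;	/* guest physical address of the region start */
	uint64_t size;	/* region length in bytes */
	uint64_t hva;	/* host virtual address where it is mapped */
};

/*
 * Translate the host virtual range [hva, hva + len) to a GPA.
 * Returns 0 and stores the result in *gpa on success, or -1 if the
 * range is not fully contained in a single region.
 */
static int hva_to_gpa(const struct mem_region *regions, size_t n,
		      uint64_t hva, uint64_t len, uint64_t *gpa)
{
	size_t i;

	assert(regions != NULL && gpa != NULL);

	for (i = 0; i < n; i++) {
		const struct mem_region *r = &regions[i];
		uint64_t off;

		if (hva < r->hva)
			continue;

		off = hva - r->hva;
		/* Written this way to avoid overflow in off + len. */
		if (off >= r->size || len > r->size - off)
			continue;

		*gpa = r->gpa + off;
		return 0;
	}

	return -1;
}
```

If QEMU passed GPAs directly, a lookup like this would simply drop
out of the ring setup path.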

Thanks,
Tiwei