Re: [PATCH v2 2/2] drm/i915/gvt: substitute kvm_read/write_guest with vfio_dma_rw
From: Alex Williamson
Date: Wed Jan 15 2020 - 15:06:56 EST
On Tue, 14 Jan 2020 22:54:55 -0500
Yan Zhao <yan.y.zhao@xxxxxxxxx> wrote:
> As a device model, it is better to read/write guest memory using vfio
> interface, so that vfio is able to maintain dirty info of device IOVAs.
>
> Compared to kvm interfaces kvm_read/write_guest(), vfio_dma_rw() has ~600
> cycles more overhead on average.
>
> ------------------------------------
> | interface       | avg cpu cycles |
> |----------------------------------|
> | kvm_write_guest | 1554           |
> |----------------------------------|
> | kvm_read_guest  | 707            |
> |----------------------------------|
> | vfio_dma_rw(w)  | 2274           |
> |----------------------------------|
> | vfio_dma_rw(r)  | 1378           |
> ------------------------------------
In v1 you had:
------------------------------------
| interface       | avg cpu cycles |
|----------------------------------|
| kvm_write_guest | 1546           |
|----------------------------------|
| kvm_read_guest  | 686            |
|----------------------------------|
| vfio_iova_rw(w) | 2233           |
|----------------------------------|
| vfio_iova_rw(r) | 1262           |
------------------------------------
So the kvm numbers remained within +0.5-3% while the vfio numbers are
now +1.8-9.2%. I would have expected the algorithm change to at least
not be worse for small accesses and be better for accesses crossing
page boundaries. Do you know what happened?
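The percentages above come from dividing each v2 figure by its v1 counterpart. A minimal sketch in C that reproduces them from the two tables; the helper names pct_change() and print_changes() are made up for illustration and are not part of the patch:

```c
#include <assert.h>
#include <stdio.h>

/* Relative change from the v1 figure to the v2 figure, in percent. */
double pct_change(double v1, double v2)
{
	assert(v1 != 0.0);
	return (v2 - v1) / v1 * 100.0;
}

/* Print the v1 -> v2 delta for each row of the two cycle-count tables. */
void print_changes(void)
{
	/* Average cycle counts copied from the v1 and v2 tables. */
	static const struct {
		const char *name;
		double v1, v2;
	} rows[] = {
		{ "kvm_write_guest",    1546, 1554 },
		{ "kvm_read_guest",      686,  707 },
		{ "vfio rw (write)",    2233, 2274 },
		{ "vfio rw (read)",     1262, 1378 },
	};
	size_t i;

	for (i = 0; i < sizeof(rows) / sizeof(rows[0]); i++)
		printf("%-16s %+.1f%%\n", rows[i].name,
		       pct_change(rows[i].v1, rows[i].v2));
}
```

Calling print_changes() from a main() gives deltas of roughly +0.5% and +3.1% for the kvm rows and +1.8% and +9.2% for the vfio rows, so the read path is where v2 regressed most.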
> Comparison of benchmark scores is as below:
> ---------------------------------------------------
> | avg score  | kvm_read/write_guest | vfio_dma_rw |
> |-------------------------------------------------|
> | Glmark2    | 1284                 | 1296        |
> |-------------------------------------------------|
> | Lightsmark | 61.24                | 61.27       |
> |-------------------------------------------------|
> | OpenArena  | 140.9                | 137.4       |
> |-------------------------------------------------|
> | Heaven     | 671                  | 670         |
> ---------------------------------------------------
> No obvious performance downgrade found.
>
> Cc: Kevin Tian <kevin.tian@xxxxxxxxx>
> Signed-off-by: Yan Zhao <yan.y.zhao@xxxxxxxxx>
> ---
> drivers/gpu/drm/i915/gvt/kvmgt.c | 26 +++++++-------------------
> 1 file changed, 7 insertions(+), 19 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/gvt/kvmgt.c b/drivers/gpu/drm/i915/gvt/kvmgt.c
> index bd79a9718cc7..17edc9a7ff05 100644
> --- a/drivers/gpu/drm/i915/gvt/kvmgt.c
> +++ b/drivers/gpu/drm/i915/gvt/kvmgt.c
> @@ -1966,31 +1966,19 @@ static int kvmgt_rw_gpa(unsigned long handle, unsigned long gpa,
>  			void *buf, unsigned long len, bool write)
>  {
>  	struct kvmgt_guest_info *info;
> -	struct kvm *kvm;
> -	int idx, ret;
> -	bool kthread = current->mm == NULL;
> +	int ret;
> +	struct intel_vgpu *vgpu;
> +	struct device *dev;
>
>  	if (!handle_valid(handle))
>  		return -ESRCH;
>
>  	info = (struct kvmgt_guest_info *)handle;
> -	kvm = info->kvm;
> -
> -	if (kthread) {
> -		if (!mmget_not_zero(kvm->mm))
> -			return -EFAULT;
> -		use_mm(kvm->mm);
> -	}
> -
> -	idx = srcu_read_lock(&kvm->srcu);
> -	ret = write ? kvm_write_guest(kvm, gpa, buf, len) :
> -		      kvm_read_guest(kvm, gpa, buf, len);
> -	srcu_read_unlock(&kvm->srcu, idx);
> +	vgpu = info->vgpu;
> +	dev = mdev_dev(vgpu->vdev.mdev);
>
> -	if (kthread) {
> -		unuse_mm(kvm->mm);
> -		mmput(kvm->mm);
> -	}
> +	ret = write ? vfio_dma_rw(dev, gpa, buf, len, true) :
> +			vfio_dma_rw(dev, gpa, buf, len, false);
As Paolo suggested previously, this can be simplified:
	ret = vfio_dma_rw(dev, gpa, buf, len, write);
>
>  	return ret;
Or, simpler still, drop the ret variable:
	return vfio_dma_rw(dev, gpa, buf, len, write);
Thanks,
Alex
> }
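For readers who want to convince themselves that the suggested simplification is behavior-preserving, here is a small userspace sketch. It assumes a made-up stand-in, fake_dma_rw(), that copies to or from a static byte array; it is not the kernel's vfio_dma_rw() and takes no device argument. rw_ternary() mirrors the form in the patch and rw_direct() mirrors the form suggested in the review:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical "guest memory" for illustration only. */
static unsigned char fake_guest_mem[64];

/*
 * Hypothetical stand-in for vfio_dma_rw(): copies into the fake guest
 * memory when write is true, out of it otherwise. Returns 0 on success
 * and -1 if the access falls outside the array.
 */
int fake_dma_rw(size_t gpa, void *buf, size_t len, bool write)
{
	assert(buf != NULL);
	if (gpa > sizeof(fake_guest_mem) ||
	    len > sizeof(fake_guest_mem) - gpa)
		return -1;
	if (write)
		memcpy(fake_guest_mem + gpa, buf, len);
	else
		memcpy(buf, fake_guest_mem + gpa, len);
	return 0;
}

/* Form used in the patch: branch on write, pass a literal each time. */
int rw_ternary(size_t gpa, void *buf, size_t len, bool write)
{
	int ret;

	ret = write ? fake_dma_rw(gpa, buf, len, true) :
		      fake_dma_rw(gpa, buf, len, false);
	return ret;
}

/* Form suggested in the review: forward write and return directly. */
int rw_direct(size_t gpa, void *buf, size_t len, bool write)
{
	return fake_dma_rw(gpa, buf, len, write);
}
```

Both wrappers call the same function with the same arguments for either value of write, so data written through one can be read back through the other, and both report the same error for an out-of-range access.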