Re: [PATCHv2 08/10] vfio/mdev: Improve the create/remove sequence

From: Pierre Morel
Date: Thu May 09 2019 - 12:28:21 EST

On 09/05/2019 11:06, Cornelia Huck wrote:
[vfio-ap folks: find a question regarding removal further down]

On Wed, 8 May 2019 22:06:48 +0000
Parav Pandit <parav@xxxxxxxxxxxx> wrote:

-----Original Message-----
From: Cornelia Huck <cohuck@xxxxxxxxxx>
Sent: Wednesday, May 8, 2019 12:10 PM
To: Parav Pandit <parav@xxxxxxxxxxxx>
Cc: kvm@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx;
kwankhede@xxxxxxxxxx; alex.williamson@xxxxxxxxxx; cjia@xxxxxxxxxx
Subject: Re: [PATCHv2 08/10] vfio/mdev: Improve the create/remove

On Tue, 30 Apr 2019 17:49:35 -0500
Parav Pandit <parav@xxxxxxxxxxxx> wrote:


@@ -373,16 +330,15 @@ int mdev_device_remove(struct device *dev, bool force_remove)

 	type = to_mdev_type(mdev->type_kobj);
+	mdev_remove_sysfs_files(dev, type);
+	device_del(&mdev->dev);
 	parent = mdev->parent;
+	ret = parent->ops->remove(mdev);
+	if (ret)
+		dev_err(&mdev->dev, "Remove failed: err=%d\n", ret);

I think carrying on with removal regardless of the return code of the
->remove callback makes sense, as it simply matches usual practice.
However, are we sure that every vendor driver works well with that? I think
it should, as removal from bus unregistration (vs. from the sysfs
file) was always something it could not veto, but have you looked at the
individual drivers?
I looked at the following drivers a little while back, and looked again now.

drivers/gpu/drm/i915/gvt/kvmgt.c clears the handle's valid state in intel_vgpu_release(), which should finish before remove() is invoked.

s390 vfio-ccw driver (drivers/s390/cio/vfio_ccw_ops.c): vfio_ccw_mdev_remove() always returns 0.
s390 crypto fails the remove() once vfio_ap_mdev_release() marks kvm null, which should finish before remove() is invoked.

That one is giving me a bit of a headache (the ->kvm reference is
supposed to keep us from detaching while a vm is running), so let's cc:
the vfio-ap maintainers to see whether they have any concerns.

We are aware of this race and corrected it in the IRQ patches, where it would have become a real issue.
We now increment and decrement the KVM reference counter inside open and release.
It should be right after this.
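
To illustrate the idea, here is a minimal userspace sketch, not the actual vfio-ap code. All names (struct fake_kvm, struct fake_mdev, fake_mdev_open(), fake_mdev_release(), fake_mdev_remove()) are hypothetical stand-ins, and the plain integer only models the KVM reference counter; it assumes remove() refuses with -EBUSY while a guest is still attached.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for struct kvm; "users" models its refcount. */
struct fake_kvm {
	int users;
};

/* Hypothetical stand-in for the vendor driver's per-mdev state. */
struct fake_mdev {
	struct fake_kvm *kvm;	/* non-NULL while a guest uses the device */
};

/* open(): take a reference so the VM cannot go away underneath us. */
static int fake_mdev_open(struct fake_mdev *m, struct fake_kvm *kvm)
{
	if (m->kvm)
		return -EBUSY;	/* already attached to a guest */
	kvm->users++;
	m->kvm = kvm;
	return 0;
}

/* release(): drop the reference taken in open() and detach. */
static void fake_mdev_release(struct fake_mdev *m)
{
	if (!m->kvm)
		return;
	assert(m->kvm->users > 0);
	m->kvm->users--;
	m->kvm = NULL;
}

/* remove(): assumed to refuse while a guest is still attached. */
static int fake_mdev_remove(const struct fake_mdev *m)
{
	return m->kvm ? -EBUSY : 0;
}
```

In this model the reference is held exactly between open and release, so a remove() that arrives after release has finished sees no guest and succeeds.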

Thanks for the cc,

Pierre Morel
Linux/KVM/QEMU in Böblingen - Germany