Re: [PATCH] s390/vfio-ap: fix memory leak in mdev remove callback

From: Cornelia Huck
Date: Thu May 06 2021 - 06:23:05 EST


On Wed, 5 May 2021 13:28:26 -0400
Tony Krowiak <akrowiak@xxxxxxxxxxxxx> wrote:

> The mdev remove callback for the vfio_ap device driver bails out with
> -EBUSY if the mdev is in use by a KVM guest. The intended purpose was
> to prevent the mdev from being removed while in use; however, returning a
> non-zero rc does not prevent removal. This could result in a memory leak
> of the resources allocated when the mdev was created. In addition, the
> KVM guest will still have access to the AP devices assigned to the mdev
> even though the mdev no longer exists.
>
> To prevent this scenario, cleanup will be done - including unplugging the
> AP adapters, domains and control domains - regardless of whether the mdev
> is in use by a KVM guest or not.
>
> Fixes: 258287c994de ("s390: vfio-ap: implement mediated device open callback")
> Cc: stable@xxxxxxxxxxxxxxx
> Signed-off-by: Tony Krowiak <akrowiak@xxxxxxxxxxx>
> Signed-off-by: Tony Krowiak <akrowiak@xxxxxxxxxxxxx>
> ---
> drivers/s390/crypto/vfio_ap_ops.c | 39 +++++++++++++++++++++++--------
> 1 file changed, 29 insertions(+), 10 deletions(-)
>
> diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c
> index b2c7e10dfdcd..757166da947e 100644
> --- a/drivers/s390/crypto/vfio_ap_ops.c
> +++ b/drivers/s390/crypto/vfio_ap_ops.c
> @@ -335,6 +335,32 @@ static void vfio_ap_matrix_init(struct ap_config_info *info,
> matrix->adm_max = info->apxa ? info->Nd : 15;
> }
>
> +static bool vfio_ap_mdev_has_crycb(struct ap_matrix_mdev *matrix_mdev)
> +{
> + return (matrix_mdev->kvm && matrix_mdev->kvm->arch.crypto.crycbd);
> +}
> +
> +static void vfio_ap_mdev_clear_apcb(struct ap_matrix_mdev *matrix_mdev)
> +{
> + /*
> + * If the KVM pointer is in the process of being set, wait until the
> + * process has completed.
> + */
> + wait_event_cmd(matrix_mdev->wait_for_kvm,
> + !matrix_mdev->kvm_busy,
> + mutex_unlock(&matrix_dev->lock),
> + mutex_lock(&matrix_dev->lock));
> +
> + if (vfio_ap_mdev_has_crycb(matrix_mdev)) {
> + matrix_mdev->kvm_busy = true;
> + mutex_unlock(&matrix_dev->lock);
> + kvm_arch_crypto_clear_masks(matrix_mdev->kvm);
> + mutex_lock(&matrix_dev->lock);
> + matrix_mdev->kvm_busy = false;
> + wake_up_all(&matrix_mdev->wait_for_kvm);
> + }
> +}

Looking at vfio_ap_mdev_unset_kvm(), do you need to unhook the kvm here
as well?

(Or can you maybe even combine the two functions into one?)

> +
> static int vfio_ap_mdev_create(struct mdev_device *mdev)
> {
> struct ap_matrix_mdev *matrix_mdev;
> @@ -366,16 +392,9 @@ static int vfio_ap_mdev_remove(struct mdev_device *mdev)
> struct ap_matrix_mdev *matrix_mdev = mdev_get_drvdata(mdev);
>
> mutex_lock(&matrix_dev->lock);
> -
> - /*
> - * If the KVM pointer is in flux or the guest is running, disallow
> - * un-assignment of control domain.
> - */
> - if (matrix_mdev->kvm_busy || matrix_mdev->kvm) {
> - mutex_unlock(&matrix_dev->lock);
> - return -EBUSY;
> - }
> -
> + WARN(vfio_ap_mdev_has_crycb(matrix_mdev),
> + "Removing mdev leaves KVM guest without any crypto devices");
> + vfio_ap_mdev_clear_apcb(matrix_mdev);
> vfio_ap_mdev_reset_queues(mdev);
> list_del(&matrix_mdev->node);
> kfree(matrix_mdev);