Re: [PATCH 1/1] s390/vfio-ap: fix circular lockdep when setting/clearing crypto masks

From: Tony Krowiak
Date: Wed Feb 10 2021 - 15:35:37 EST

On 2/10/21 5:53 AM, Cornelia Huck wrote:
On Tue, 9 Feb 2021 14:48:30 -0500
Tony Krowiak <akrowiak@xxxxxxxxxxxxx> wrote:

This patch fixes a circular locking dependency in the CI introduced by
commit f21916ec4826 ("s390/vfio-ap: clean up vfio_ap resources when KVM
pointer invalidated"). The lockdep only occurs when starting a Secure
Execution guest. Crypto virtualization (vfio_ap) is not yet supported for
SE guests; however, in order to avoid CI errors, this fix is being
provided.

The circular lockdep was introduced when the masks in the guest's APCB
were set while holding matrix_dev->lock. While the lock is definitely
needed to protect the setting/unsetting of the KVM pointer, it is not
necessarily critical for setting the masks, so the masks will no longer
be set under protection of matrix_dev->lock.

Fixes: f21916ec4826 ("s390/vfio-ap: clean up vfio_ap resources when KVM pointer invalidated")
Cc: stable@xxxxxxxxxxxxxxx
Signed-off-by: Tony Krowiak <akrowiak@xxxxxxxxxxxxx>
---
drivers/s390/crypto/vfio_ap_ops.c | 75 ++++++++++++++++++-------------
1 file changed, 45 insertions(+), 30 deletions(-)

 static void vfio_ap_mdev_unset_kvm(struct ap_matrix_mdev *matrix_mdev)
 {
-	kvm_arch_crypto_clear_masks(matrix_mdev->kvm);
-	matrix_mdev->kvm->arch.crypto.pqap_hook = NULL;
-	vfio_ap_mdev_reset_queues(matrix_mdev->mdev);
-	kvm_put_kvm(matrix_mdev->kvm);
-	matrix_mdev->kvm = NULL;
+	if (matrix_mdev->kvm) {
If you're doing setting/unsetting under matrix_dev->lock, is it
possible that matrix_mdev->kvm gets unset between here and the next
line, as you don't hold the lock?

That is highly unlikely, because the matrix_mdev->kvm pointer is cleared
only in this function, which is called from just two places: the
notifier that handles the VFIO_GROUP_NOTIFY_SET_KVM notification when
the KVM pointer is cleared, and the vfio_ap_mdev_release() function,
which is called when the mdev fd is closed (i.e., when the guest is
shut down). In fact, with the only end-to-end implementation currently
available, the notifier callback is never invoked to clear the KVM
pointer, because the vfio_ap_mdev_release() callback is invoked first
and it unregisters the notifier callback.

Having said that, I suppose there is no guarantee that future userspace
clients will not do things in a different order. At the very least, it
wouldn't hurt to protect against that as you suggest below.


Maybe you could
- grab a reference to kvm while holding the lock
- call the mask handling functions with that kvm reference
- lock again, drop the reference, and do the rest of the processing?

+		kvm_arch_crypto_clear_masks(matrix_mdev->kvm);
+		mutex_lock(&matrix_dev->lock);
+		matrix_mdev->kvm->arch.crypto.pqap_hook = NULL;
+		vfio_ap_mdev_reset_queues(matrix_mdev->mdev);
+		kvm_put_kvm(matrix_mdev->kvm);
+		matrix_mdev->kvm = NULL;
+		mutex_unlock(&matrix_dev->lock);
+	}
 }
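
For illustration only, the three-step approach suggested above (grab a
reference under the lock, handle the masks without the lock, then lock
again to finish) might look roughly like the userspace model below. This
is not driver code and not the eventual fix: struct kvm_model,
struct mdev_model, matrix_lock and the model_*() helpers are hypothetical
stand-ins for struct kvm, struct ap_matrix_mdev, matrix_dev->lock and the
kvm_get_kvm()/kvm_put_kvm()/kvm_arch_crypto_clear_masks() calls. The
"pointer unchanged" check in step 3 is an assumption added for the
sketch, and the queue reset is left out.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; none of these names exist in the driver. */
struct kvm_model {
	atomic_int refcount;	/* models kvm_get_kvm()/kvm_put_kvm() */
	int masks_set;		/* models the crypto masks in the guest's APCB */
	void *pqap_hook;	/* models kvm->arch.crypto.pqap_hook */
};

struct mdev_model {
	struct kvm_model *kvm;	/* models matrix_mdev->kvm */
};

/* Models matrix_dev->lock. */
static pthread_mutex_t matrix_lock = PTHREAD_MUTEX_INITIALIZER;

static void model_get(struct kvm_model *kvm)
{
	atomic_fetch_add(&kvm->refcount, 1);
}

static void model_put(struct kvm_model *kvm)
{
	int old = atomic_fetch_sub(&kvm->refcount, 1);

	assert(old > 0);	/* catch unbalanced puts in this model */
	(void)old;
}

/* Models kvm_arch_crypto_clear_masks(); called without matrix_lock held. */
static void model_clear_masks(struct kvm_model *kvm)
{
	kvm->masks_set = 0;
}

static void model_unset_kvm(struct mdev_model *m)
{
	struct kvm_model *kvm;

	/* Step 1: take a private reference while the lock protects m->kvm. */
	pthread_mutex_lock(&matrix_lock);
	kvm = m->kvm;
	if (kvm)
		model_get(kvm);
	pthread_mutex_unlock(&matrix_lock);

	if (!kvm)
		return;

	/* Step 2: clear the masks through the private reference, lock dropped. */
	model_clear_masks(kvm);

	/* Step 3: retake the lock and finish only if nobody else got here first. */
	pthread_mutex_lock(&matrix_lock);
	if (m->kvm == kvm) {
		kvm->pqap_hook = NULL;
		m->kvm = NULL;
		model_put(kvm);	/* reference that was held via m->kvm */
	}
	pthread_mutex_unlock(&matrix_lock);

	model_put(kvm);		/* private reference taken in step 1 */
}
```

Because the pointer is only read and cleared with matrix_lock held, a
second caller either sees NULL in step 1 or finds the pointer already
changed in step 3, so the reference held via m->kvm is dropped once.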