Re: [PATCH 1/1] s390/vfio-ap: fix circular lockdep when setting/clearing crypto masks

From: Tony Krowiak
Date: Wed Feb 10 2021 - 17:08:47 EST




On 2/10/21 10:24 AM, Halil Pasic wrote:
On Wed, 10 Feb 2021 11:53:34 +0100
Cornelia Huck <cohuck@xxxxxxxxxx> wrote:

On Tue, 9 Feb 2021 14:48:30 -0500
Tony Krowiak <akrowiak@xxxxxxxxxxxxx> wrote:

This patch fixes a circular locking dependency in the CI introduced by
commit f21916ec4826 ("s390/vfio-ap: clean up vfio_ap resources when KVM
pointer invalidated"). The lockdep only occurs when starting a Secure
Execution guest. Crypto virtualization (vfio_ap) is not yet supported for
SE guests; however, in order to avoid CI errors, this fix is being
provided.

The circular lockdep was introduced when the masks in the guest's APCB
were taken under the matrix_dev->lock. While the lock is definitely
needed to protect the setting/unsetting of the KVM pointer, it is not
necessarily critical for setting the masks, so this will not be done under
protection of the matrix_dev->lock.

Fixes: f21916ec4826 ("s390/vfio-ap: clean up vfio_ap resources when KVM pointer invalidated")
Cc: stable@xxxxxxxxxxxxxxx
Signed-off-by: Tony Krowiak <akrowiak@xxxxxxxxxxxxx>
---
drivers/s390/crypto/vfio_ap_ops.c | 75 ++++++++++++++++++-------------
1 file changed, 45 insertions(+), 30 deletions(-)
static void vfio_ap_mdev_unset_kvm(struct ap_matrix_mdev *matrix_mdev)
{
- kvm_arch_crypto_clear_masks(matrix_mdev->kvm);
- matrix_mdev->kvm->arch.crypto.pqap_hook = NULL;
- vfio_ap_mdev_reset_queues(matrix_mdev->mdev);
- kvm_put_kvm(matrix_mdev->kvm);
- matrix_mdev->kvm = NULL;
+ if (matrix_mdev->kvm) {
If you're doing setting/unsetting under matrix_dev->lock, is it
possible that matrix_mdev->kvm gets unset between here and the next
line, as you don't hold the lock?

Maybe you could
- grab a reference to kvm while holding the lock
- call the mask handling functions with that kvm reference
- lock again, drop the reference, and do the rest of the processing?
I agree, matrix_mdev->kvm can go NULL any time and we are risking
a null pointer dereference here.

Another idea would be to do


static void vfio_ap_mdev_unset_kvm(struct ap_matrix_mdev *matrix_mdev)
{
struct kvm *kvm;
mutex_lock(&matrix_dev->lock);
if (matrix_mdev->kvm) {
kvm = matrix_mdev->kvm;
matrix_mdev->kvm = NULL;
mutex_unlock(&matrix_dev->lock);
kvm_arch_crypto_clear_masks(kvm);
mutex_lock(&matrix_dev->lock);
matrix_mdev->kvm->arch.crypto.pqap_hook = NULL;
vfio_ap_mdev_reset_queues(matrix_mdev->mdev);
kvm_put_kvm(kvm);
}
mutex_unlock(&matrix_dev->lock);
}

That way only one unset would actually do the unset and cleanup
and every other invocation would bail out with only checking
matrix_mdev->kvm.

How ironic, that is exactly what I did after agreeing with Connie.


+ kvm_arch_crypto_clear_masks(matrix_mdev->kvm);
+ mutex_lock(&matrix_dev->lock);
+ matrix_mdev->kvm->arch.crypto.pqap_hook = NULL;
+ vfio_ap_mdev_reset_queues(matrix_mdev->mdev);
+ kvm_put_kvm(matrix_mdev->kvm);
+ matrix_mdev->kvm = NULL;
+ mutex_unlock(&matrix_dev->lock);
+ }
}