Re: [PATCH 1/1] s390/vfio-ap: fix circular lockdep when setting/clearing crypto masks

From: Halil Pasic
Date: Wed Feb 10 2021 - 10:26:04 EST


On Wed, 10 Feb 2021 11:53:34 +0100
Cornelia Huck <cohuck@xxxxxxxxxx> wrote:

> On Tue, 9 Feb 2021 14:48:30 -0500
> Tony Krowiak <akrowiak@xxxxxxxxxxxxx> wrote:
>
> > This patch fixes a circular locking dependency in the CI introduced by
> > commit f21916ec4826 ("s390/vfio-ap: clean up vfio_ap resources when KVM
> > pointer invalidated"). The lockdep only occurs when starting a Secure
> > Execution guest. Crypto virtualization (vfio_ap) is not yet supported for
> > SE guests; however, in order to avoid CI errors, this fix is being
> > provided.
> >
> > The circular lockdep was introduced when the masks in the guest's APCB
> > were taken under the matrix_dev->lock. While the lock is definitely
> > needed to protect the setting/unsetting of the KVM pointer, it is not
> > necessarily critical for setting the masks, so this will not be done under
> > protection of the matrix_dev->lock.
> >
> > Fixes: f21916ec4826 ("s390/vfio-ap: clean up vfio_ap resources when KVM pointer invalidated")
> > Cc: stable@xxxxxxxxxxxxxxx
> > Signed-off-by: Tony Krowiak <akrowiak@xxxxxxxxxxxxx>
> > ---
> > drivers/s390/crypto/vfio_ap_ops.c | 75 ++++++++++++++++++-------------
> > 1 file changed, 45 insertions(+), 30 deletions(-)
> >
>
> > static void vfio_ap_mdev_unset_kvm(struct ap_matrix_mdev *matrix_mdev)
> > {
> > - kvm_arch_crypto_clear_masks(matrix_mdev->kvm);
> > - matrix_mdev->kvm->arch.crypto.pqap_hook = NULL;
> > - vfio_ap_mdev_reset_queues(matrix_mdev->mdev);
> > - kvm_put_kvm(matrix_mdev->kvm);
> > - matrix_mdev->kvm = NULL;
> > + if (matrix_mdev->kvm) {
>
> If you're doing setting/unsetting under matrix_dev->lock, is it
> possible that matrix_mdev->kvm gets unset between here and the next
> line, as you don't hold the lock?
>
> Maybe you could
> - grab a reference to kvm while holding the lock
> - call the mask handling functions with that kvm reference
> - lock again, drop the reference, and do the rest of the processing?

I agree, matrix_mdev->kvm can go NULL at any time, so we are risking
a NULL pointer dereference here.

Another idea would be to do


static void vfio_ap_mdev_unset_kvm(struct ap_matrix_mdev *matrix_mdev)
{
	struct kvm *kvm;

	mutex_lock(&matrix_dev->lock);
	if (matrix_mdev->kvm) {
		kvm = matrix_mdev->kvm;
		matrix_mdev->kvm = NULL;
		mutex_unlock(&matrix_dev->lock);
		kvm_arch_crypto_clear_masks(kvm);
		mutex_lock(&matrix_dev->lock);
		/* matrix_mdev->kvm is NULL by now, so go through the local copy */
		kvm->arch.crypto.pqap_hook = NULL;
		vfio_ap_mdev_reset_queues(matrix_mdev->mdev);
		kvm_put_kvm(kvm);
	}
	mutex_unlock(&matrix_dev->lock);
}

That way only the first caller would actually do the unset and cleanup;
every other invocation would find matrix_mdev->kvm already NULL and bail
out after that single check.
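
To make the "claim the pointer under the lock, drop the lock for the mask
handling, retake it for the rest" idea easier to reason about, here is a
small user-space sketch of the same pattern. It is only an illustration, not
driver code: struct fake_kvm, struct fake_mdev, fake_clear_masks() and
fake_unset_kvm() are hypothetical stand-ins, a pthread mutex plays the role
of matrix_dev->lock, and a plain counter stands in for the kvm reference
that kvm_put_kvm() would drop.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct kvm and struct ap_matrix_mdev. */
struct fake_kvm {
	int refcount;		/* models the reference dropped by kvm_put_kvm() */
	int masks_cleared;	/* counts calls to the mask-clearing step */
	void *pqap_hook;
};

struct fake_mdev {
	struct fake_kvm *kvm;
};

/* Plays the role of matrix_dev->lock. */
static pthread_mutex_t matrix_lock = PTHREAD_MUTEX_INITIALIZER;

/* Models kvm_arch_crypto_clear_masks(): must be called without matrix_lock. */
static void fake_clear_masks(struct fake_kvm *kvm)
{
	kvm->masks_cleared++;
}

/* Returns 1 if this caller did the teardown, 0 if there was nothing to do. */
static int fake_unset_kvm(struct fake_mdev *mdev)
{
	struct fake_kvm *kvm;

	/* Claim the pointer: whoever sees it non-NULL owns the cleanup. */
	pthread_mutex_lock(&matrix_lock);
	kvm = mdev->kvm;
	if (!kvm) {
		pthread_mutex_unlock(&matrix_lock);
		return 0;
	}
	mdev->kvm = NULL;
	pthread_mutex_unlock(&matrix_lock);

	/* Mask handling runs on the local copy, outside the lock. */
	fake_clear_masks(kvm);

	/* Retake the lock for the rest, still using only the local copy. */
	pthread_mutex_lock(&matrix_lock);
	kvm->pqap_hook = NULL;
	assert(kvm->refcount > 0);
	kvm->refcount--;
	pthread_mutex_unlock(&matrix_lock);
	return 1;
}
```

Calling fake_unset_kvm() twice on the same fake_mdev does the teardown once;
the second call sees a NULL pointer and returns immediately. The sketch does
not model what other paths may do with the device during the unlocked
window, which would still need a look in the real driver.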


> > + kvm_arch_crypto_clear_masks(matrix_mdev->kvm);
> > + mutex_lock(&matrix_dev->lock);
> > + matrix_mdev->kvm->arch.crypto.pqap_hook = NULL;
> > + vfio_ap_mdev_reset_queues(matrix_mdev->mdev);
> > + kvm_put_kvm(matrix_mdev->kvm);
> > + matrix_mdev->kvm = NULL;
> > + mutex_unlock(&matrix_dev->lock);
> > + }
> > }
>