Re: [PATCH 1/1] s390/vfio-ap: fix circular lockdep when setting/clearing crypto masks

From: Cornelia Huck
Date: Thu Feb 11 2021 - 07:42:38 EST


On Wed, 10 Feb 2021 15:34:24 -0500
Tony Krowiak <akrowiak@xxxxxxxxxxxxx> wrote:

> On 2/10/21 5:53 AM, Cornelia Huck wrote:
> > On Tue, 9 Feb 2021 14:48:30 -0500
> > Tony Krowiak <akrowiak@xxxxxxxxxxxxx> wrote:
> >
> >> This patch fixes a circular locking dependency in the CI introduced by
> >> commit f21916ec4826 ("s390/vfio-ap: clean up vfio_ap resources when KVM
> >> pointer invalidated"). The lockdep only occurs when starting a Secure
> >> Execution guest. Crypto virtualization (vfio_ap) is not yet supported for
> >> SE guests; however, in order to avoid CI errors, this fix is being
> >> provided.
> >>
> >> The circular lockdep was introduced when the masks in the guest's APCB
> >> were taken under the matrix_dev->lock. While the lock is definitely
> >> needed to protect the setting/unsetting of the KVM pointer, it is not
> >> necessarily critical for setting the masks, so this will not be done under
> >> protection of the matrix_dev->lock.
> >>
> >> Fixes: f21916ec4826 ("s390/vfio-ap: clean up vfio_ap resources when KVM pointer invalidated")
> >> Cc: stable@xxxxxxxxxxxxxxx
> >> Signed-off-by: Tony Krowiak <akrowiak@xxxxxxxxxxxxx>
> >> ---
> >> drivers/s390/crypto/vfio_ap_ops.c | 75 ++++++++++++++++++-------------
> >> 1 file changed, 45 insertions(+), 30 deletions(-)
> >>
> >> static void vfio_ap_mdev_unset_kvm(struct ap_matrix_mdev *matrix_mdev)
> >> {
> >> -	kvm_arch_crypto_clear_masks(matrix_mdev->kvm);
> >> -	matrix_mdev->kvm->arch.crypto.pqap_hook = NULL;
> >> -	vfio_ap_mdev_reset_queues(matrix_mdev->mdev);
> >> -	kvm_put_kvm(matrix_mdev->kvm);
> >> -	matrix_mdev->kvm = NULL;
> >> +	if (matrix_mdev->kvm) {
> > If you're doing setting/unsetting under matrix_dev->lock, is it
> > possible that matrix_mdev->kvm gets unset between here and the next
> > line, as you don't hold the lock?
>
> That is highly unlikely because the only place the matrix_mdev->kvm
> pointer is cleared is in this function which is called from only two
> places: the notifier that handles the VFIO_GROUP_NOTIFY_SET_KVM
> notification when the KVM pointer is cleared; the vfio_ap_mdev_release()
> function which is called when the mdev fd is closed (i.e., when the guest
> is shut down). The fact is, with the only end-to-end implementation
> currently available, the notifier callback is never invoked to clear
> the KVM pointer because the vfio_ap_mdev_release callback is
> invoked first and it unregisters the notifier callback.
>
> Having said that, I suppose there is no guarantee that there will not
> be different userspace clients in the future that do things in a
> different order. At the very least, it wouldn't hurt to protect against
> that as you suggest below.

Yes, if userspace is able to use the interfaces in a certain way, we
should always make sure that nothing bad happens if it does so, even if
known userspace applications are well-behaved.

[Can we make an 'evil userspace' test program, maybe? The hardware
dependency makes this hard to run, though.]

>
> >
> > Maybe you could
> > - grab a reference to kvm while holding the lock
> > - call the mask handling functions with that kvm reference
> > - lock again, drop the reference, and do the rest of the processing?
> >
> >> +		kvm_arch_crypto_clear_masks(matrix_mdev->kvm);
> >> +		mutex_lock(&matrix_dev->lock);
> >> +		matrix_mdev->kvm->arch.crypto.pqap_hook = NULL;
> >> +		vfio_ap_mdev_reset_queues(matrix_mdev->mdev);
> >> +		kvm_put_kvm(matrix_mdev->kvm);
> >> +		matrix_mdev->kvm = NULL;
> >> +		mutex_unlock(&matrix_dev->lock);
> >> +	}
> >> }
>
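To make the three steps I suggested above a bit more concrete, here is
a rough userspace sketch of the pattern. It is not vfio_ap code: the
pthread mutex stands in for matrix_dev->lock, a plain counter stands in
for the kvm refcount, and every name in it (fake_kvm, fake_mdev,
sketch_unset_kvm, ...) is made up for illustration. The real thing
would presumably use kvm_get_kvm()/kvm_put_kvm() and would also need to
reset the queues and the pqap_hook in the last step.

```c
/*
 * Userspace sketch only. All types and functions here are hypothetical
 * stand-ins; none of this is the actual vfio_ap implementation.
 */
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

struct fake_kvm {
	int refs;	/* stands in for the kvm reference count */
	int masks_set;	/* stands in for the crypto masks in the APCB */
};

struct fake_mdev {
	struct fake_kvm *kvm;	/* stands in for matrix_mdev->kvm */
};

/* Stands in for matrix_dev->lock. */
static pthread_mutex_t dev_lock = PTHREAD_MUTEX_INITIALIZER;

/* Only ever called with dev_lock held in this sketch. */
static void fake_kvm_get(struct fake_kvm *kvm)
{
	kvm->refs++;
}

static void fake_kvm_put(struct fake_kvm *kvm)
{
	assert(kvm->refs > 0);
	kvm->refs--;
}

static void sketch_unset_kvm(struct fake_mdev *mdev)
{
	struct fake_kvm *kvm;

	/* Step 1: grab our own reference while holding the lock. */
	pthread_mutex_lock(&dev_lock);
	kvm = mdev->kvm;
	if (kvm)
		fake_kvm_get(kvm);
	pthread_mutex_unlock(&dev_lock);

	if (!kvm)
		return;

	/*
	 * Step 2: do the mask handling without the lock, using the
	 * reference taken above rather than mdev->kvm.
	 */
	kvm->masks_set = 0;

	/* Step 3: lock again and do the rest of the processing. */
	pthread_mutex_lock(&dev_lock);
	if (mdev->kvm == kvm) {
		/* Nobody cleared the pointer in the meantime. */
		mdev->kvm = NULL;
		fake_kvm_put(kvm);	/* reference held via mdev->kvm */
	}
	fake_kvm_put(kvm);		/* reference taken in step 1 */
	pthread_mutex_unlock(&dev_lock);
}
```

The point of the recheck in step 3 is that a second caller racing with
us would find either the old pointer (and take its own reference) or
NULL (and return early), so the pointer is never dereferenced after it
has been cleared.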