Re: [PATCH] s390/vfio-ap: Clean up vfio_ap resources when KVM pointer invalidated

From: Halil Pasic
Date: Fri Dec 04 2020 - 16:55:09 EST


On Fri, 4 Dec 2020 14:46:30 -0500
Tony Krowiak <akrowiak@xxxxxxxxxxxxx> wrote:

> On 12/4/20 2:05 PM, Halil Pasic wrote:
> > On Fri, 4 Dec 2020 09:43:59 -0500
> > Tony Krowiak <akrowiak@xxxxxxxxxxxxx> wrote:
> >
> >>>> +{
> >>>> + if (matrix_mdev->kvm) {
> >>>> + (matrix_mdev->kvm);
> >>>> + matrix_mdev->kvm->arch.crypto.pqap_hook = NULL;
> >>> Is a plain assignment to arch.crypto.pqap_hook apropriate, or do we need
> >>> to take more care?
> >>>
> >>> For instance kvm_arch_crypto_set_masks() takes kvm->lock before poking
> >>> kvm->arch.crypto.crycb.
> >> I do not think so. The CRYCB is used by KVM to provide crypto resources
> >> to the guest so it makes sense to protect it from changes to it while
> >> passing
> >> the AP devices through to the guest. The hook is used only when an AQIC
> >> executed on the guest is intercepted by KVM. If the notifier
> >> is being invoked to notify vfio_ap that KVM has been set to NULL, this means
> >> the guest is gone in which case there will be no AP instructions to
> >> intercept.
> > If the update to pqap_hook isn't observed as atomic we still have a
> > problem. With torn writes or reads we would try to use a corrupt function
> > pointer. While the compiler probably ain't likely to generate silly code
> > for the above assignment (multiple write instructions less then
> > quadword wide), I know of nothing that would prohibit the compiler to do
> > so.
>
> I see that in the handle_pqap() function in arch/s390/kvm/priv.c
> that gets called when the AQIC instruction is intercepted,
> the pqap_hook is protected by locking the owner of the hook:
>
>         if (!try_module_get(vcpu->kvm->arch.crypto.pqap_hook->owner))
>             return -EOPNOTSUPP;
>         ret = vcpu->kvm->arch.crypto.pqap_hook->hook(vcpu);
> module_put(vcpu->kvm->arch.crypto.pqap_hook->owner);
>
> Maybe that is what we should do when the kvm->arch.crypto.pqap_hook
> is set to NULL?

To my best knowledge that ain't no locking but mere refcounting. The
purpose of that is probably to prevent the owner module, and the code
pointed to by the 'hook' function pointer from being unloaded while we
are executing that very same code.

Why is that necessary, frankly I have no idea. We do tend to invalidate
the callback before doing our module_put in vfio_ap_mdev_release(). Maybe
the case you are handling right now is the reason (because the
callback is invalidated in vfio_ap_mdev_release() only if !!kvm.

Regards,
Halil