Re: [PATCH] s390/vfio-ap: Clean up vfio_ap resources when KVM pointer invalidated
From: Halil Pasic
Date: Sun Dec 13 2020 - 17:59:45 EST
On Fri, 11 Dec 2020 15:52:55 -0500
Tony Krowiak <akrowiak@xxxxxxxxxxxxx> wrote:
>
>
> On 12/7/20 7:01 PM, Halil Pasic wrote:
> > On Mon, 7 Dec 2020 13:50:36 -0500
> > Tony Krowiak <akrowiak@xxxxxxxxxxxxx> wrote:
> >
> >> On 12/4/20 2:05 PM, Halil Pasic wrote:
> >>> On Fri, 4 Dec 2020 09:43:59 -0500
> >>> Tony Krowiak <akrowiak@xxxxxxxxxxxxx> wrote:
> >>>
> >>>>>> +{
> >>>>>> + if (matrix_mdev->kvm) {
> >>>>>> + (matrix_mdev->kvm);
> >>>>>> + matrix_mdev->kvm->arch.crypto.pqap_hook = NULL;
> >>>>> Is a plain assignment to arch.crypto.pqap_hook apropriate, or do we need
> >>>>> to take more care?
> >>>>>
> >>>>> For instance kvm_arch_crypto_set_masks() takes kvm->lock before poking
> >>>>> kvm->arch.crypto.crycb.
> >>>> I do not think so. The CRYCB is used by KVM to provide crypto resources
> >>>> to the guest so it makes sense to protect it from changes to it while
> >>>> passing
> >>>> the AP devices through to the guest. The hook is used only when an AQIC
> >>>> executed on the guest is intercepted by KVM. If the notifier
> >>>> is being invoked to notify vfio_ap that KVM has been set to NULL, this means
> >>>> the guest is gone in which case there will be no AP instructions to
> >>>> intercept.
> >>> If the update to pqap_hook isn't observed as atomic we still have a
> >>> problem. With torn writes or reads we would try to use a corrupt function
> >>> pointer. While the compiler probably ain't likely to generate silly code
> >>> for the above assignment (multiple write instructions less then
> >>> quadword wide), I know of nothing that would prohibit the compiler to do
> >>> so.
> >> I'm sorry, but I still don't understand why you tkvm_vfio_group_set_kvmhink this is a problem
> >> given what I stated above.
> > I assume you are specifically referring to 'the guest is gone in which
> > case there will be no AP instructions to intercept'. I assume by 'guest
> > is gone' you mean that the VM is being destroyed, and the vcpus are out
> > of SIE. You are probably right for the invocation of
> > kvm_vfio_group_set_kvm() in kvm_vfio_destroy(), but is that true for
> > the invocation in the KVM_DEV_VFIO_GROUP_DEL case in
> > kvm_vfio_set_group()? I.e. can't we get the notifier called when the
> > qemu device is hot unplugged (modulo remove which unregisters the
> > notifier and usually precludes the notifier being with NULL called at
> > all)?
>
> I am assuming by your question that the qemu device you are
> talking about the '-device vfio-ap' specified on the qemu command
> line or attached vi||||||a qemu device_add.
Yes.
> When an mdev is hot
> unplugged, the
> vfio_ap driver's release callback gets invoked when the mdev fd is
> closed. The
> release callback unregisters the notifier, so it does not get called
> when the guest subsequently shuts down.
>
That is what I meant by 'modulo remove which unregisters the notifier
and usually precludes the notifier being with NULL called at all', but
unfortunately I mixed up remove and release.
AFAIU release should be called before the notifier gets invoked
regardless of whether we have a hot-unplug of '-device vfio-ap' or
a shutdown. The whole effort is about what happens if userspace does
not adhered to this. If I apply the logic of your last response to the
whole situation, then there is nothing to do (AFAIU).
The point I'm trying to make is, that in a case of the hot-unplug, the
guest may survive the call to the notifier and also the vfio_mdev device
it was associated to at some point. So your argument that 'the guest is
gone in which case there will be no AP instructions to interpret' does
not hold.
Regards,
Halil