Re: [PATCH v2] s390/vfio-ap: fix memory leak in mdev remove callback

From: Halil Pasic
Date: Thu May 13 2021 - 20:15:20 EST


On Thu, 13 May 2021 15:23:27 -0400
Tony Krowiak <akrowiak@xxxxxxxxxxxxx> wrote:

> On 5/13/21 1:45 PM, Halil Pasic wrote:
> > On Thu, 13 May 2021 10:35:05 -0400
> > Tony Krowiak <akrowiak@xxxxxxxxxxxxx> wrote:
> >
> >> On 5/12/21 2:35 PM, Halil Pasic wrote:
> >>> On Mon, 10 May 2021 17:48:37 -0400
> >>> Tony Krowiak <akrowiak@xxxxxxxxxxxxx> wrote:
> >>>
> >>>> The mdev remove callback for the vfio_ap device driver bails out with
> >>>> -EBUSY if the mdev is in use by a KVM guest. The intended purpose was
> >>>> to prevent the mdev from being removed while in use; however, returning a
> >>>> non-zero rc does not prevent removal. This could result in a memory leak
> >>>> of the resources allocated when the mdev was created. In addition, the
> >>>> KVM guest will still have access to the AP devices assigned to the mdev
> >>>> even though the mdev no longer exists.
> >>>>
> >>>> To prevent this scenario, cleanup will be done - including unplugging the
> >>>> AP adapters, domains and control domains - regardless of whether the mdev
> >>>> is in use by a KVM guest or not.
> >>>>
> >>>> Fixes: 258287c994de ("s390: vfio-ap: implement mediated device open callback")
> >>>> Cc: stable@xxxxxxxxxxxxxxx
> >>>> Signed-off-by: Tony Krowiak <akrowiak@xxxxxxxxxxx>
> >>>> ---
> >>>> drivers/s390/crypto/vfio_ap_ops.c | 13 ++-----------
> >>>> 1 file changed, 2 insertions(+), 11 deletions(-)
> >>>>
> >>>> diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c
> >>>> index b2c7e10dfdcd..f90c9103dac2 100644
> >>>> --- a/drivers/s390/crypto/vfio_ap_ops.c
> >>>> +++ b/drivers/s390/crypto/vfio_ap_ops.c
> >>>> @@ -26,6 +26,7 @@
> >>>>
> >>>> static int vfio_ap_mdev_reset_queues(struct mdev_device *mdev);
> >>>> static struct vfio_ap_queue *vfio_ap_find_queue(int apqn);
> >>>> +static void vfio_ap_mdev_unset_kvm(struct ap_matrix_mdev *matrix_mdev);
> >>>>
> >>>> static int match_apqn(struct device *dev, const void *data)
> >>>> {
> >>>> @@ -366,17 +367,7 @@ static int vfio_ap_mdev_remove(struct mdev_device *mdev)
> >>>> struct ap_matrix_mdev *matrix_mdev = mdev_get_drvdata(mdev);
> >>>>
> >>>> mutex_lock(&matrix_dev->lock);
> >>>> -
> >>>> - /*
> >>>> - * If the KVM pointer is in flux or the guest is running, disallow
> >>>> - * un-assignment of control domain.
> >>>> - */
> >>>> - if (matrix_mdev->kvm_busy || matrix_mdev->kvm) {
> >>>> - mutex_unlock(&matrix_dev->lock);
> >>>> - return -EBUSY;
> >>>> - }
> >>>> -
> >>>> - vfio_ap_mdev_reset_queues(mdev);
> >>>> + vfio_ap_mdev_unset_kvm(matrix_mdev);
> >>>> list_del(&matrix_mdev->node);
> >>>> kfree(matrix_mdev);
> >>> Are we at risk of handle_pqap() in arch/s390/kvm/priv.c using an
> >>> already freed pqap_hook (which is a member of the matrix_mdev pointee
> >>> that is freed just above my comment).
> >>>
> >>> I'm aware of the fact that vfio_ap_mdev_unset_kvm() does a
> >>> matrix_mdev->kvm->arch.crypto.pqap_hook = NULL but that is
> >>> AFRICT not done under any lock relevant for handle_pqap(). I guess
> >>> the idea is, I guess, the check cited below
> >>>
> >>> static int handle_pqap(struct kvm_vcpu *vcpu)
> >>> [..]
> >>> /*
> >>> * Verify that the hook callback is registered, lock the owner
> >>> * and call the hook.
> >>> */
> >>> if (vcpu->kvm->arch.crypto.pqap_hook) {
> >>> if (!try_module_get(vcpu->kvm->arch.crypto.pqap_hook->owner))
> >>> return -EOPNOTSUPP;
> >>> ret = vcpu->kvm->arch.crypto.pqap_hook->hook(vcpu);
> >>> module_put(vcpu->kvm->arch.crypto.pqap_hook->owner);
> >>> if (!ret && vcpu->run->s.regs.gprs[1] & 0x00ff0000)
> >>> kvm_s390_set_psw_cc(vcpu, 3);
> >>> return ret;
> >>> }
> >>>
> >>> is going to catch it, but I'm not sure it is guaranteed to catch it.
> >>> Opinions?
> >> The hook itself - handle_pqap() function in vfio_ap_ops.c - also checks
> >> to see if the reference to the hook is set and terminates with an error
> >> if it
> >> is not. If the hook is invoked subsequent to the remove callback above,
> >> all should be fine since the check is also done under the matrix_dev->lock.
> >>
> > I don't quite understand your logic. Let us assume matrix_mdev was freed,
> > but vcpu->kvm->arch.crypto.pqap_hook still points to what used to be
> > (*matrix_mdev).pqap_hook. In that case the function pointer
> > vcpu->kvm->arch.crypto.pqap_hook->hook is used after it was freed, and
> > may not point to the handle_pqap() function in vfio_ap_ops.c, thus the
> > check you are referring to ain't necessarily relevant. Than is
> > if you mean the check in the handle_pqap() function in vfio_ap_ops.c; if
> > you mean the check in handle_pqap() in arch/s390/kvm/priv.c, that one is
> > not done under the matrix_dev->lock. Or do I have a hole somewhere in my
> > reasoning?
>
> What I am saying is the vcpu->kvm->arch.crypto.pqap_hook
> will either be NULL or point to the handle_pqap() function in the
> vfio_ap driver.

Please read the code again. In my reading of the code
vcpu->kvm->arch.crypto.pqap_hook is never supposed to point to >(or does
point to) the handle_pqap() function defined in vfio_ap_ops.c. It points
to the pqap_hook member of struct ap_matrix_mdev (the type of the member
is struct kvm_s390_module_hook, which in turn has a function pointer
member called hook, which is supposed to hold the address of
handle_pqap() function defined in vfio_ap_ops.c, and thus point to
it).

Because of this, I don't think the rest of your argument is valid.
Furthermore I believe we first need to get to common ground on this
one before proceeding any further. If you happen to preserve your
opinion after checking again, I think we should try to discuss this
offline, as one of us is likely looking at the wrong code.

Regards,
Halil

> In the latter case, the handler in the driver will get
> called and try to acquire the matrix_dev->lock. The function that
> sets the vcpu->kvm->arch.crypto.pqap_hook to NULL also takes that
> lock. If the pointer is still active, then the handler will do its thing.
> If not, then the handler will return without enabling or disabling
> IRQs. That should not be a problem since the unset_kvm function
> resets the queues which will disable the IRQs.
>
> I don't see how
> the vcpu->kvm->arch.crypto.pqap_hook can point to anything
> other than the handler or be NULL unless KVM is gone. Based on
> my observations of the behavior, unless there is some
> other way for the remove callback to be invoked other than in
> response to a request from userspace via the sysfs remove
> attribute, it will not get called until the file descriptor is
> closed in which case the release callback will also unset_kvm.
> I think you are worrying about something that will likely never
> happen.
>
> >
> > Regards,
> > Halil
> >
>