Re: [PATCH v2] s390/vfio-ap: fix memory leak in mdev remove callback

From: Christian Borntraeger
Date: Tue May 18 2021 - 13:01:54 EST




On 18.05.21 17:33, Halil Pasic wrote:
On Tue, 18 May 2021 15:59:36 +0200
Christian Borntraeger <borntraeger@xxxxxxxxxx> wrote:

On 18.05.21 15:42, Tony Krowiak wrote:


On 5/18/21 5:30 AM, Christian Borntraeger wrote:


On 17.05.21 21:10, Halil Pasic wrote:
On Mon, 17 May 2021 09:37:42 -0400
Tony Krowiak <akrowiak@xxxxxxxxxxxxx> wrote:

Because of this, I don't think the rest of your argument is valid.

Okay, so your concern is that between the point in time the
vcpu->kvm->arch.crypto.pqap_hook pointer is checked in
priv.c and the point in time the handle_pqap() function
in vfio_ap_ops.c is called, the memory allocated for the
matrix_mdev containing the struct kvm_s390_module_hook
may get freed, thus rendering the function pointer invalid.
While not impossible, that seems extremely unlikely to
happen. Can you articulate a scenario where that could
even occur?

Malicious userspace. We tend to do the pqap aqic just once
in the guest right after the queue is detected. I do agree
it ain't very likely to happen during normal operation. But why are
you asking?

Would it help if the code in priv.c read the hook once
and then only worked on the copy? We could protect that with RCU
and call synchronize_rcu in vfio_ap_mdev_unset_kvm after
unsetting the pointer.

Unfortunately, just "the hook" is ambiguous in this context. We
have kvm->arch.crypto.pqap_hook, which is supposed to point to
a struct kvm_s390_module_hook member of struct ap_matrix_mdev
that is also called pqap_hook. And struct kvm_s390_module_hook
has a function pointer member named "hook".
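
To keep the three names apart, here is a minimal userspace sketch of the layout described above. The struct bodies are simplified stand-ins, not the kernel definitions; ap_matrix_mdev_sketch, crypto_sketch, fake_handle_pqap and names_demo are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdlib.h>

struct kvm_vcpu;	/* opaque stand-in; never dereferenced here */

/* (3) the struct whose function pointer member is named "hook" */
struct kvm_s390_module_hook {
	int (*hook)(struct kvm_vcpu *vcpu);
};

/* (2) stand-in for struct ap_matrix_mdev: it embeds the struct above
 * as a member that is also called pqap_hook */
struct ap_matrix_mdev_sketch {
	struct kvm_s390_module_hook pqap_hook;
};

/* (1) stand-in for kvm->arch.crypto: holds a pointer, again called
 * pqap_hook, that points into the matrix_mdev allocation */
struct crypto_sketch {
	struct kvm_s390_module_hook *pqap_hook;
};

/* hypothetical handler standing in for handle_pqap() */
static int fake_handle_pqap(struct kvm_vcpu *vcpu)
{
	(void)vcpu;
	return 42;
}

int names_demo(void)
{
	struct crypto_sketch crypto = { 0 };
	struct ap_matrix_mdev_sketch *mdev = calloc(1, sizeof(*mdev));
	int ret;

	if (!mdev)
		return -1;

	mdev->pqap_hook.hook = fake_handle_pqap;	/* fill in (3) */
	crypto.pqap_hook = &mdev->pqap_hook;		/* (1) points at (2) */
	assert(crypto.pqap_hook == &mdev->pqap_hook);

	/* The call goes pointer -> embedded struct -> function pointer. */
	ret = crypto.pqap_hook->hook(NULL);

	/* Clear (1) before freeing mdev; freeing first would leave (1),
	 * and therefore (3), dangling. */
	crypto.pqap_hook = NULL;
	free(mdev);
	return ret;
}
```

Because (1) points into the allocation that holds (2), freeing the matrix_mdev while (1) is still set is what makes the function pointer invalid.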

I was referring to the full struct.


I'll look into this.

I think it could work. In priv.c, call rcu_read_lock, save the
pointer, do the check and the call, then call rcu_read_unlock.
In vfio_ap, use rcu_assign_pointer to set the pointer and,
after setting it to NULL, call synchronize_rcu.

In my opinion, we should make the accesses to the
kvm->arch.crypto.pqap_hook pointer properly synchronized. I'm
not sure if that is what you are proposing. How do we usually
do synchronisation on the stuff that lives in kvm->arch?


RCU is a method of synchronization. We make sure that the structure
pqap_hook is still valid as long as we are inside the RCU read
lock. So the idea is: clear the pointer, wait until all old readers
have finished, and then proceed with getting rid of the structure.
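
The ordering can be sketched in plain userspace C. This is not kernel code and not a real RCU implementation: a reader counter stands in for rcu_read_lock/rcu_read_unlock, and a spin loop stands in for synchronize_rcu. All toy_* names, hook_sketch, reader_call, writer_set, writer_unset and rcu_sketch_demo are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

struct hook_sketch {
	int (*hook)(int arg);
};

/* Stand-in for kvm->arch.crypto.pqap_hook. */
static _Atomic(struct hook_sketch *) pqap_hook_ptr;
/* Toy replacement for RCU reader tracking. */
static atomic_int active_readers;

static void toy_read_lock(void)   { atomic_fetch_add(&active_readers, 1); }
static void toy_read_unlock(void) { atomic_fetch_sub(&active_readers, 1); }

/* Toy analog of synchronize_rcu(): returns once no reader is active. */
static void toy_synchronize(void)
{
	while (atomic_load(&active_readers) != 0)
		;	/* spin */
}

static int toy_handle(int arg)
{
	return arg + 1;
}

/* Reader side (the priv.c role): load the pointer once, then only
 * use the local copy while inside the read-side section. */
int reader_call(int arg)
{
	struct hook_sketch *h;
	int ret = -1;			/* -1 means "no hook installed" */

	toy_read_lock();
	h = atomic_load(&pqap_hook_ptr);	/* analog of rcu_dereference() */
	if (h && h->hook)
		ret = h->hook(arg);
	toy_read_unlock();
	return ret;
}

/* Writer side (the vfio_ap role): publish the structure. */
int writer_set(void)
{
	struct hook_sketch *h = malloc(sizeof(*h));

	if (!h)
		return -1;
	h->hook = toy_handle;
	atomic_store(&pqap_hook_ptr, h);	/* analog of rcu_assign_pointer() */
	return 0;
}

/* Writer side: clear the pointer, wait for old readers, then free. */
void writer_unset(void)
{
	struct hook_sketch *old =
		atomic_exchange(&pqap_hook_ptr, (struct hook_sketch *)NULL);

	toy_synchronize();
	free(old);
}

int rcu_sketch_demo(void)
{
	int before, after;

	if (writer_set() != 0)
		return -1;
	before = reader_call(41);	/* hook installed */
	writer_unset();
	after = reader_call(41);	/* hook gone, nothing is called */
	assert(before == 42);
	return (before == 42 && after == -1) ? 0 : 1;
}
```

The point of the sketch is the order in writer_unset: the pointer is cleared first, so new readers see NULL, and the free only happens after every reader that might still hold the old copy has left its read-side section.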