Fixes a memory leak in the mdev remove callback when invoked while the
mdev is in use by a KVM guest. Instead of returning -EBUSY from the
callback, a full cleanup of the resources allocated to the mdev is
performed because regardless of the value returned from the function, the
mdev is removed from sysfs.
The cleanup of resources allocated to the mdev may coincide with the
interception of the PQAP(AQIC) instruction in which case data needed to
handle the interception may get removed. A patch is included in this series
to synchronize access to resources needed by the interception handler to
protect against invalid memory accesses.
The first pass (PATCH v3) at trying to synchronize access to the pqap
function pointer employed RCU. The problem is, the RCU read-side critical
section would have to include the execution of the pqap function which
sleeps; RCU disallows sleeping inside an RCU region. When I subsequently
tried to encompass the pqap function within the
rcu_read_lock/rcu_read_unlock, I ended up seeing lockdep warnings in the
syslog.
It was suggested that we use an rw_semaphore to synchronize access to
the pqap hook, but that also produced lockdep complaints along the lines
of the following:
  Possible unsafe locking scenario:

        CPU0                          CPU1
        ----                          ----
   down_read(&rwsem)
     in handle_pqap (priv.c);
                                 lock(&matrix_dev->lock);
                                   in vfio_ap_mdev_set_kvm (vfio_ap_ops.c)
                                 down_write(&rwsem);
                                   in vfio_ap_mdev_set_kvm (vfio_ap_ops.c)
   lock(&matrix_dev->lock);
     in handle_pqap (vfio_ap_ops.c)
Access to the mdev must be done under the matrix_dev->lock to ensure that
it doesn't get freed via the remove callback while in use. This appears
to be mutually exclusive with setting/unsetting the pqap_hook pointer
due to lockdep issues.
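To make the complaint above concrete, here is a minimal userspace sketch of
the kind of order check that triggers it. It is not lockdep itself: it only
records direct "A held while taking B" pairs for two lock classes, and all
names (lock_class, task, track_acquire, track_release) are hypothetical and
exist only for this illustration.

```c
#include <stdbool.h>

/* Two lock classes standing in for the rwsem and matrix_dev->lock. */
enum lock_class { LC_RWSEM, LC_MATRIX, LC_COUNT };

/* order_seen[a][b] is true once b was acquired while a was held. */
static bool order_seen[LC_COUNT][LC_COUNT];

/* Per-context record of which lock classes are currently held. */
struct task {
	bool held[LC_COUNT];
};

/*
 * Record the acquisition of 'next' by 't'. Returns false if this
 * acquisition inverts an order recorded earlier by any context,
 * which is the ABBA pattern reported in the scenario above.
 */
bool track_acquire(struct task *t, enum lock_class next)
{
	bool ok = true;

	for (int held = 0; held < LC_COUNT; held++) {
		if (!t->held[held] || held == (int)next)
			continue;
		if (order_seen[next][held])
			ok = false;	/* next -> held was seen before */
		order_seen[held][next] = true;
	}
	t->held[next] = true;
	return ok;
}

/* Record that 't' no longer holds 'c'. */
void track_release(struct task *t, enum lock_class c)
{
	t->held[c] = false;
}
```

Replaying the scenario, the handle_pqap path records rwsem -> matrix, and the
vfio_ap_mdev_set_kvm path then takes matrix followed by rwsem, at which point
track_acquire() returns false.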
The solution:
------------
The lifetime of the handle_pqap function (vfio_ap_ops.c) is synchronous
with the lifetime of the vfio_ap module, so there really is no reason
to tie the setting/clearing of its function pointer with the lifetime
of a guest or even an mdev. If the function pointer is set when the
vfio_ap module is loaded and cleared when the vfio_ap module is unloaded,
then access to it can be protected independently from mdev creation or
removal as well as the starting or shutdown of a guest. As long as
access to the mdev is always controlled by the matrix_dev->lock, the
mdev cannot be freed without other functions being aware.
Change log:
v3 -> v4:
--------
* Created a registry for crypto hooks in priv.c with functions for
registering/unregistering function pointers in kvm_host.h (for s390).
* Register the function pointer for handling the PQAP instruction when
the vfio_ap module is loaded and unregister it when the module is
unloaded.