Re: [PATCH v6 00/10] s390: vfio-ap: dynamic configuration support

From: Tony Krowiak
Date: Tue Oct 15 2019 - 16:33:51 EST

On 10/8/19 8:57 AM, Pierre Morel wrote:

On 10/8/19 12:48 PM, Halil Pasic wrote:
On Fri, 13 Sep 2019 17:26:48 -0400
Tony Krowiak <akrowiak@xxxxxxxxxxxxx> wrote:

The current design for AP pass-through does not support making dynamic
changes to the AP matrix of a running guest resulting in three deficiencies
this patch series is intended to mitigate:

1. Adapters, domains and control domains can not be added to or removed
ÂÂÂ from a running guest. In order to modify a guest's AP configuration,
ÂÂÂ the guest must be terminated; only then can AP resources be assigned
ÂÂÂ to or unassigned from the guest's matrix mdev. The new AP configuration
ÂÂÂ becomes available to the guest when it is subsequently restarted.

2. The AP bus's /sys/bus/ap/apmask and /sys/bus/ap/aqmask interfaces can
ÂÂÂ be modified by a root user without any restrictions. A change to either
ÂÂÂ mask can result in AP queue devices being unbound from the vfio_ap
ÂÂÂ device driver and bound to a zcrypt device driver even if a guest is
ÂÂÂ using the queues, thus giving the host access to the guest's private
ÂÂÂ crypto data and vice versa.

3. The APQNs derived from the Cartesian product of the APIDs of the
ÂÂÂ adapters and APQIs of the domains assigned to a matrix mdev must
ÂÂÂ reference an AP queue device bound to the vfio_ap device driver.

This patch series introduces the following changes to the current design
to alleviate the shortcomings described above as well as to implement more
of the AP architecture:

1. A root user will be prevented from making changes to the AP bus's
ÂÂÂ /sys/bus/ap/apmask or /sys/bus/ap/aqmask if the ownership of an APQN
ÂÂÂ changes from the vfio_ap device driver to a zcrypt driver when the APQN
ÂÂÂ is assigned to a matrix mdev.

2. The sysfs bind/unbind interfaces will be disabled for the vfio_ap
ÂÂÂ device driver.

Doesn't this have the potential of breaking some userspace stuff that
might be out there?

3. Allow AP resources to be assigned to or removed from a matrix mdev
ÂÂÂ while a guest is using it and hot plug the resource into or hot unplug
ÂÂÂ the resource from the running guest.
This looks like a natural extension of the interface -- i.e. should not
break any userspace.

4. Allow assignment of an AP adapter or domain to a matrix mdev even if it
ÂÂÂ results in assignment of an APQN that does not reference an AP queue
ÂÂÂ device bound to the vfio_ap device driver, as long as the APQN is owned
ÂÂÂ by the vfio_ap driver. Allowing over-provisioning of AP resources
ÂÂÂ better models the architecture which does not preclude assigning AP
ÂÂÂ resources that are not yet available in the system. If/when the queue
ÂÂÂ becomes available to the host, it will immediately also become available
ÂÂÂ to the guest.
Same here -- I don't think this change breaks any userspace.

1. Rationale for changes to AP bus's apmask/aqmask interfaces:
Due to the extremely sensitive nature of cryptographic data, it is
imperative that great care be taken to ensure that such data is secured.
Allowing a root user, either inadvertently or maliciously, to configure
these masks such that a queue is shared between the host and a guest is
not only avoidable, it is advisable.

Just curious: how is it possible to do such a configuration?

In the current implementation of dedicated crypto, there is nothing
stopping a sysadmin from changing the apmask/aqmask in manner that
transfers ownership of one more APQNs from the vfio_ap device driver to
zcrypt which results in unbinding the queue devices from vfio_ap and
binding them to the zcrypt drive. If a guest happens to be using the
queue at the time, both the host and guest will have access. That is
fixed by this series.

 It was suggested that this scenario
is better handled in user space with management software, but that does
not preclude a malicious administrator from using the sysfs interfaces
to gain access to a guest's crypto data. It was also suggested that this
scenario could be avoided by taking access to the adapter away from the
guest and zeroing out the queues prior to the vfio_ap driver releasing the
device; however, stealing an adapter in use from a guest as a by-product
of an operation is bad and will likely cause problems for the guest
unnecessarily. It was decided that the most effective solution with the
least number of negative side effects is to prevent the situation at the

Stealing an adapter in use by a guest, insn't it what is done if we allow to unassign an AP/Domain using the unassign sysfs interface when the mediated device is in use by the guest?

Yes, but that is a deliberate action as opposed to a side effect of
bind/unbind. It is the very definition of dynamic configuration (a.k.a.,
hot plug/unplug).

2. Rationale for disabling bind/unbind interfaces for vfio_ap driver:
By disabling the bind/unbind interfaces for the vfio_ap device driver,
the user is forced to use the AP bus's apmask/aqmask interfaces to control
the probing and removing of AP queues. There are two primary reasons for
disabling the bind/unbind interfaces for the vfio_ap device driver:

* The device architecture does not provide a means to prevent unbinding
ÂÂ a device from a device driver, so an AP queue device can be unbound
ÂÂ from the vfio_ap driver even when the queue is in use by a guest. By
ÂÂ disabling the unbind interface, the user is forced to use the AP bus's
ÂÂ apmask/aqmask interfaces which will prevent this.

Isn't this fixed by your filtering (if implemented correctly)? BTW I believe
it solves the problem regardless whether the unbind was triggered by the
drivers unbind attribute or by a[pq]mask

* Binding of AP queues is controlled by the AP bus /sys/bus/ap/apmask and
ÂÂ /sys/bus/ap/aqmask interfaces. If the masks indicate that an APQN is
ÂÂ owned by zcrypt, trying to bind it to the vfio_ap device driver will
ÂÂ fail; therefore, the bind interface is somewhat redundant and certainly
ÂÂ unnecessary.

Tony Krowiak (10):
ÂÂ s390: vfio-ap: Refactor vfio_ap driver probe and remove callbacks
ÂÂ s390: vfio-ap: allow assignment of unavailable AP resources to mdev
ÂÂÂÂ device
ÂÂ s390: vfio-ap: allow hot plug/unplug of AP resources using mdev device
ÂÂ s390: vfio-ap: filter CRYCB bits for unavailable queue devices
ÂÂ s390: vfio-ap: sysfs attribute to display the guest CRYCB
ÂÂ s390: vfio-ap: update guest CRYCB in vfio_ap probe and remove
ÂÂÂÂ callbacks
ÂÂ s390: zcrypt: driver callback to indicate resource in use
ÂÂ s390: vfio-ap: implement in-use callback for vfio_ap driver
ÂÂ s390: vfio-ap: added versioning to vfio_ap module
ÂÂ s390: vfio-ap: update documentation
I believe it would be worthwhile to reorder the patches (fixes and
re-factoring first, the features).


 Documentation/s390/vfio-ap.rst | 899 +++++++++++++++++++++++++---------
 drivers/s390/crypto/ap_bus.c | 144 +++++-
 drivers/s390/crypto/ap_bus.h | 4 +
 drivers/s390/crypto/vfio_ap_drv.c | 27 +-
 drivers/s390/crypto/vfio_ap_ops.c | 610 ++++++++++++++---------
 drivers/s390/crypto/vfio_ap_private.h | 12 +-
 6 files changed, 1200 insertions(+), 496 deletions(-)