Re: [PATCH] s390/vfio-ap: handle response code 01 on queue reset

From: Christian Borntraeger
Date: Wed Nov 29 2023 - 12:13:22 EST


Am 29.11.23 um 15:35 schrieb Tony Krowiak:
In the current implementation, response code 01 (AP queue number not valid)
is handled as a default case along with other response codes returned from
a queue reset operation that are not handled specifically. Barring a bug,
response code 01 will occur only when a queue has been externally removed
from the host's AP configuration; nn this case, the queue must
be reset by the machine in order to avoid leaking crypto data if/when the
queue is returned to the host's configuration. The response code 01 case
will be handled specifically by logging a WARN message followed by cleaning
up the IRQ resources.


To me it looks like this can be triggered by the LPAR admin, correct? So it
is not desireable but possible.
In that case I prefer to not use WARN, maybe use dev_warn or dev_err instead.
WARN can be a disruptive event if panic_on_warn is set.


Signed-off-by: Tony Krowiak <akrowiak@xxxxxxxxxxxxx>
---
drivers/s390/crypto/vfio_ap_ops.c | 31 +++++++++++++++++++++++++++++++
1 file changed, 31 insertions(+)

diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c
index 4db538a55192..91d6334574d8 100644
--- a/drivers/s390/crypto/vfio_ap_ops.c
+++ b/drivers/s390/crypto/vfio_ap_ops.c
@@ -1652,6 +1652,21 @@ static int apq_status_check(int apqn, struct ap_queue_status *status)
* a value indicating a reset needs to be performed again.
*/
return -EAGAIN;
+ case AP_RESPONSE_Q_NOT_AVAIL:
+ /*
+ * This response code indicates the queue is not available.
+ * Barring a bug, response code 01 will occur only when a queue
+ * has been externally removed from the host's AP configuration;
+ * in which case, the queue must be reset by the machine in
+ * order to avoid leaking crypto data if/when the queue is
+ * returned to the host's configuration. In this case, let's go
+ * ahead and log a warning message and return 0 so the AQIC
+ * resources get cleaned up by the caller.
+ */
+ WARN(true,
+ "Unable to reset queue %02x.%04x: not in host AP configuration\n",
+ AP_QID_CARD(apqn), AP_QID_QUEUE(apqn));
+ return 0;
default:
WARN(true,
"failed to verify reset of queue %02x.%04x: TAPQ rc=%u\n",
@@ -1736,6 +1751,22 @@ static void vfio_ap_mdev_reset_queue(struct vfio_ap_queue *q)
q->reset_status.response_code = 0;
vfio_ap_free_aqic_resources(q);
break;
+ case AP_RESPONSE_Q_NOT_AVAIL:
+ /*
+ * This response code indicates the queue is not available.
+ * Barring a bug, response code 01 will occur only when a queue
+ * has been externally removed from the host's AP configuration;
+ * in which case, the queue must be reset by the machine in
+ * order to avoid leaking crypto data if/when the queue is
+ * returned to the host's configuration. In this case, let's go
+ * ahead and log a warning message then clean up the AQIC
+ * resources.
+ */
+ WARN(true,
+ "Unable to reset queue %02x.%04x: not in host AP configuration\n",
+ AP_QID_CARD(q->apqn), AP_QID_QUEUE(q->apqn));
+ vfio_ap_free_aqic_resources(q);
+ break;
default:
WARN(true,
"PQAP/ZAPQ for %02x.%04x failed with invalid rc=%u\n",