Re: [PATCH v9 17/22] s390: vfio-ap: zeroize the AP queues.

From: Cornelia Huck
Date: Fri Aug 17 2018 - 05:35:04 EST


On Wed, 15 Aug 2018 16:36:32 -0400
Tony Krowiak <akrowiak@xxxxxxxxxxxxx> wrote:

> On 08/15/2018 12:24 PM, Cornelia Huck wrote:
> > On Mon, 13 Aug 2018 17:48:14 -0400
> > Tony Krowiak <akrowiak@xxxxxxxxxxxxxxxxxx> wrote:
> >
> > Nit: please drop the leading period in the subject.
>
> I assume you mean the ending period?

Err, of course.

>
> >
> >> From: Tony Krowiak <akrowiak@xxxxxxxxxxxxx>
> >>
> >> Let's call PAPQ(ZAPQ) to zeroize a queue:
> >>
> >> * For each queue configured for a mediated matrix device
> >> when it is released.
> >>
> >> Zeroizing a queue resets the queue, clears all pending
> >> messages for the queue entries and disables adapter interruptions
> >> associated with the queue.
> >>
> >> Signed-off-by: Tony Krowiak <akrowiak@xxxxxxxxxxxxx>
> >> Reviewed-by: Halil Pasic <pasic@xxxxxxxxxxxxx>
> >> Tested-by: Michael Mueller <mimu@xxxxxxxxxxxxx>
> >> Tested-by: Farhan Ali <alifm@xxxxxxxxxxxxx>
> >> Signed-off-by: Christian Borntraeger <borntraeger@xxxxxxxxxx>
> >> ---
> >> drivers/s390/crypto/vfio_ap_ops.c | 25 +++++++++++++++++++++++++
> >> drivers/s390/crypto/vfio_ap_private.h | 25 +++++++++++++++++++++++++
> >> 2 files changed, 50 insertions(+), 0 deletions(-)
> >>
> >> diff --git a/drivers/s390/crypto/vfio_ap_private.h b/drivers/s390/crypto/vfio_ap_private.h
> >> index 3e8534b..34f982a 100644
> >> --- a/drivers/s390/crypto/vfio_ap_private.h
> >> +++ b/drivers/s390/crypto/vfio_ap_private.h
> >> @@ -74,4 +74,29 @@ struct ap_matrix_mdev {
> >> extern int vfio_ap_mdev_register(void);
> >> extern void vfio_ap_mdev_unregister(void);
> >>
> >> +static inline int vfio_ap_reset_queue(unsigned int apid, unsigned int apqi,
> >> + unsigned int retry)
> >> +{
> >> + struct ap_queue_status status;
> >> +
> >> + do {
> >> + status = ap_zapq(AP_MKQID(apid, apqi));
> >> + switch (status.response_code) {
> >> + case AP_RESPONSE_NORMAL:
> >> + return 0;
> >> + case AP_RESPONSE_RESET_IN_PROGRESS:
> >> + case AP_RESPONSE_BUSY:
> >> + msleep(20);
> >> + break;
> >> + default:
> >> + pr_warn("%s: error zeroizing %02x.%04x: response code %d\n",
> >> + VFIO_AP_MODULE_NAME, apid, apqi,
> >> + status.response_code);
> > How can we end up here? Does this mean that we just don't know what to
> > do with this response, or is this something that should never happen?
> > (How much sense does it make to print an error?)
>
> There are additional response codes that could be returned; for example,
> in the case of a catastrophic failure, such as when the function cannot be
> performed because the AP was somehow deconfigured, or because a machine
> check failure caused the AP path to be removed. It shouldn't happen, but
> all are possibilities.
> I can get rid of the message and just return -EIO if you prefer.

These sound like "ugh, we're broken anyway". I'm not sure an additional
message would help much here; I'd expect other code to just handle the
failure (especially things like machine checks). I would not oppose
removing the message :)

Maybe add a comment /* things are really broken, give up */ instead?
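
Something along these lines, as a rough user-space sketch rather than
the actual patch. The response code values, the scripted fake ap_zapq(),
the fake_zapq_load() helper and the qid computation are all placeholders
made up for illustration so that the loop can be exercised outside the
kernel; only the shape of the default case is the point:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Stand-ins for the kernel definitions; the values are placeholders. */
enum {
	AP_RESPONSE_NORMAL,
	AP_RESPONSE_RESET_IN_PROGRESS,
	AP_RESPONSE_DECONFIGURED,
	AP_RESPONSE_BUSY,
};

struct ap_queue_status {
	unsigned int response_code;
};

/* Hypothetical test hook: scripted responses replayed by the fake ap_zapq(). */
static const unsigned int *zapq_script;
static size_t zapq_script_len;
static size_t zapq_calls;

static void fake_zapq_load(const unsigned int *script, size_t len)
{
	assert(script && len > 0);
	zapq_script = script;
	zapq_script_len = len;
	zapq_calls = 0;
}

/* Fake ap_zapq(): replays the script, repeating the last entry when exhausted. */
static struct ap_queue_status ap_zapq(unsigned int qid)
{
	struct ap_queue_status status;
	size_t i = zapq_calls < zapq_script_len ? zapq_calls
						: zapq_script_len - 1;

	(void)qid;
	zapq_calls++;
	status.response_code = zapq_script[i];
	return status;
}

static int vfio_ap_reset_queue(unsigned int apid, unsigned int apqi,
			       unsigned int retry)
{
	struct ap_queue_status status;

	do {
		/* placeholder for AP_MKQID(apid, apqi) */
		status = ap_zapq((apid << 8) | apqi);
		switch (status.response_code) {
		case AP_RESPONSE_NORMAL:
			return 0;
		case AP_RESPONSE_RESET_IN_PROGRESS:
		case AP_RESPONSE_BUSY:
			/* the patch does msleep(20) here before retrying */
			break;
		default:
			/* things are really broken, give up */
			return -EIO;
		}
	} while (retry--);

	return -EBUSY;
}
```

Note that with the do/while form, passing retry = N results in up to
N + 1 calls to ap_zapq() before -EBUSY is returned.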

>
> >
> >> + return -EIO;
> >> + }
> >> + } while (retry--);
> >> +
> >> + return -EBUSY;
> >> +}
> >> +
> >> #endif /* _VFIO_AP_PRIVATE_H_ */
>
>