[PATCH v15 1/4] virt/coco/sev-guest: Add throttling awareness
From: Dionna Glaze
Date: Tue Feb 14 2023 - 11:49:01 EST
The host is permitted and encouraged to throttle guest requests to the
AMD-SP since it is a shared resource across all VMs. Without
throttling-awareness, the host returning an error will immediately lock
out access to the VMPCK, which makes the VM less useful as it can't
attest itself. Since throttling is expected for a host to protect itself
from an uncooperative guest, a cooperative host can return a VMM error
code that the request was throttled.
The driver interprets the upper 32 bits of exitinfo2 as a VMM error code.
For safety, since the encryption algorithm in GHCBv2 is AES_GCM, control
must remain in the kernel to complete the request with the current
sequence number. Returning without finishing the request allows the
guest to make another request but with different message contents. This
is IV reuse, and breaks cryptographic protections.
A quick fix is to retry for a while and then disable the VMPCK and
return to user space.
A guest request may not make it to the AMD-SP before the host returns to
the guest, so the err local variable in handle_guest_request must be
initialized the same way fw_err is. snp_issue_guest_request similarly
should set fw_err whether or not the value is non-zero, in order to
appropriately clear the error value when zero.
The IV reuse fix for invalid certs_len needs modification to work with
throttling, since a single retry with a modified exit_code may be
throttled without retry and result in a locked-out VMPCK. Instead,
change the exit_code as before and jump to the same retry label, and
deal with the error code fixup by checking if the exit_code had to be
changed.
Cc: Tom Lendacky <Thomas.Lendacky@xxxxxxx>
Cc: Paolo Bonzini <pbonzini@xxxxxxxxxx>
Cc: Joerg Roedel <jroedel@xxxxxxx>
Cc: Peter Gonda <pgonda@xxxxxxxxxx>
Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
Cc: Dave Hansen <dave.hansen@xxxxxxxxxxxxxxx>
Cc: Ingo Molnar <mingo@xxxxxxxxxx>
Cc: Borislav Petkov <Borislav.Petkov@xxxxxxx>
Cc: "H. Peter Anvin" <hpa@xxxxxxxxx>
Cc: Venu Busireddy <venu.busireddy@xxxxxxxxxx>
Cc: Michael Roth <michael.roth@xxxxxxx>
Cc: "Kirill A. Shutemov" <kirill@xxxxxxxxxxxxx>
Cc: Michael Sterritt <sterritt@xxxxxxxxxx>
Fixes: d5af44dde546 ("x86/sev: Provide support for SNP guest request NAEs")
Signed-off-by: Dionna Glaze <dionnaglaze@xxxxxxxxxx>
---
arch/x86/include/asm/sev-common.h | 3 ++-
arch/x86/kernel/sev.c | 3 +--
drivers/virt/coco/sev-guest/sev-guest.c | 34 ++++++++++++++++++++++---
3 files changed, 34 insertions(+), 6 deletions(-)
diff --git a/arch/x86/include/asm/sev-common.h b/arch/x86/include/asm/sev-common.h
index b8357d6ecd47..b63be696b776 100644
--- a/arch/x86/include/asm/sev-common.h
+++ b/arch/x86/include/asm/sev-common.h
@@ -128,8 +128,9 @@ struct snp_psc_desc {
struct psc_entry entries[VMGEXIT_PSC_MAX_ENTRY];
} __packed;
-/* Guest message request error code */
+/* Guest message request error codes */
#define SNP_GUEST_REQ_INVALID_LEN BIT_ULL(32)
+#define SNP_GUEST_REQ_ERR_BUSY BIT_ULL(33)
#define GHCB_MSR_TERM_REQ 0x100
#define GHCB_MSR_TERM_REASON_SET_POS 12
diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
index 679026a640ef..a908ffc2dfba 100644
--- a/arch/x86/kernel/sev.c
+++ b/arch/x86/kernel/sev.c
@@ -2212,14 +2212,13 @@ int snp_issue_guest_request(u64 exit_code, struct snp_req_data *input, unsigned
if (ret)
goto e_put;
+ *fw_err = ghcb->save.sw_exit_info_2;
if (ghcb->save.sw_exit_info_2) {
/* Number of expected pages are returned in RBX */
if (exit_code == SVM_VMGEXIT_EXT_GUEST_REQUEST &&
ghcb->save.sw_exit_info_2 == SNP_GUEST_REQ_INVALID_LEN)
input->data_npages = ghcb_get_rbx(ghcb);
- *fw_err = ghcb->save.sw_exit_info_2;
-
ret = -EIO;
}
diff --git a/drivers/virt/coco/sev-guest/sev-guest.c b/drivers/virt/coco/sev-guest/sev-guest.c
index 4ec4174e05a3..dc75f11c086e 100644
--- a/drivers/virt/coco/sev-guest/sev-guest.c
+++ b/drivers/virt/coco/sev-guest/sev-guest.c
@@ -30,6 +30,8 @@
#define DEVICE_NAME "sev-guest"
#define AAD_LEN 48
#define MSG_HDR_VER 1
+#define ACCEPTABLE_REQUEST_RETRY_DURATION (60*HZ)
+#define REQUEST_RETRY_DELAY (2*HZ)
struct snp_guest_crypto {
struct crypto_aead *tfm;
@@ -322,9 +324,12 @@ static int handle_guest_request(struct snp_guest_dev *snp_dev, u64 exit_code, in
u8 type, void *req_buf, size_t req_sz, void *resp_buf,
u32 resp_sz, __u64 *fw_err)
{
- unsigned long err;
+ unsigned long err = 0xff;
+ unsigned long start_time = jiffies;
+ u64 orig_exit_code = exit_code;
u64 seqno;
int rc;
+ unsigned int certs_npages = 0;
/* Get message sequence and verify that its a non-zero */
seqno = snp_get_msg_seqno(snp_dev);
@@ -338,6 +343,7 @@ static int handle_guest_request(struct snp_guest_dev *snp_dev, u64 exit_code, in
if (rc)
return rc;
+retry:
/*
* Call firmware to process the request. In this function the encrypted
* message enters shared memory with the host. So after this call the
@@ -346,6 +352,24 @@ static int handle_guest_request(struct snp_guest_dev *snp_dev, u64 exit_code, in
*/
rc = snp_issue_guest_request(exit_code, &snp_dev->input, &err);
+ /*
+ * The host may return SNP_GUEST_REQ_ERR_EBUSY if the request has been
+ * throttled. Retry in the driver to avoid returning and reusing the
+ * message sequence number on a different message.
+ */
+ if (err == SNP_GUEST_REQ_ERR_BUSY) {
+ if (jiffies - start_time > ACCEPTABLE_REQUEST_RETRY_DURATION) {
+ rc = -ETIMEDOUT;
+ /*
+ * Must disable VMPCK since it's not the user's
+ * responsibility to avoid IV reuse.
+ */
+ goto disable_vmpck;
+ }
+ schedule_timeout_killable(REQUEST_RETRY_DELAY);
+ goto retry;
+ }
+
/*
* If the extended guest request fails due to having too small of a
* certificate data buffer, retry the same guest request without the
@@ -354,7 +378,7 @@ static int handle_guest_request(struct snp_guest_dev *snp_dev, u64 exit_code, in
*/
if (exit_code == SVM_VMGEXIT_EXT_GUEST_REQUEST &&
err == SNP_GUEST_REQ_INVALID_LEN) {
- const unsigned int certs_npages = snp_dev->input.data_npages;
+ certs_npages = snp_dev->input.data_npages;
exit_code = SVM_VMGEXIT_GUEST_REQUEST;
@@ -366,8 +390,12 @@ static int handle_guest_request(struct snp_guest_dev *snp_dev, u64 exit_code, in
* of the VMPCK and the error code being propagated back to the
* user as an ioctl() return code.
*/
- rc = snp_issue_guest_request(exit_code, &snp_dev->input, &err);
+ schedule_timeout_killable(REQUEST_RETRY_DELAY);
+ goto retry;
+ }
+ if (orig_exit_code == SVM_VMGEXIT_EXT_GUEST_REQUEST &&
+ exit_code != orig_exit_code) {
/*
* Override the error to inform callers the given extended
* request buffer size was too small and give the caller the
--
2.39.1.637.g21b0678d19-goog