Re: [PATCH V3 2/2] virt: sev: Allow for retrying SNP extended requests

From: Dionna Amalie Glaze
Date: Thu Oct 27 2022 - 13:27:38 EST


I don't think we should take this patch. Reasons below.

>
> if (ghcb->save.sw_exit_info_2) {
> - /* Number of expected pages are returned in RBX */
> + /* For a SNP Extended Request, if the request was placed with
> + * insufficient data pages. The host will return the number of
> + * pages required using RBX in the GHCB. We can than retry the
> + * call as an SNP Request to fulfill the command without getting
> + * the extended request data.
> + */
> if (exit_code == SVM_VMGEXIT_EXT_GUEST_REQUEST &&
> - ghcb->save.sw_exit_info_2 == SNP_GUEST_REQ_INVALID_LEN)
> - input->data_npages = ghcb_get_rbx(ghcb);
> -
> - *fw_err = ghcb->save.sw_exit_info_2;
> + ghcb->save.sw_exit_info_2 == SNP_GUEST_REQ_INVALID_LEN) {
> + int npages = ghcb_get_rbx(ghcb);
> +
> + ghcb_clear_rax(ghcb);
> + ghcb_clear_rbx(ghcb);
> +
> + ret = sev_es_ghcb_hv_call(ghcb, &ctxt,
> + SVM_VMGEXIT_GUEST_REQUEST,
> + input->req_gpa,
> + input->resp_gpa);
> + if (ret)
> + goto e_put;
> +

I'm not keen on reissuing the call in this function. I think
issue_request should do its job of sending a request to the host and
returning the specified data, in this case the number of pages in RBX.
I know it's not particularly fun to interpret exitinfo2 in a couple
places, but it serves a purpose. We don't want this function to grow
to have special cases for all the commands that can be sent to the PSP
if they don't involve data passed back through the GHCB. The
get_ghcb/put_ghcb frame is the only thing we really need to respect in
here.

The sev-guest device owns the VMPCKn keys, the message sequence
number, and the responsibility of sending a coherent response back to
user space. Once we account for the host changing the certificate page
length during the request, and for not wanting to return to the guest
without completing the firmware call, the length might grow past the
4KiB max constant we have so far. The driver can choose either to issue
the request without extended data, as you do here, or to reallocate its
cert_data buffer and ask for the extended data again.
It shouldn't matter to the core functionality of issuing a single
request.
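
To make that split concrete, here is a rough user-space sketch, not
kernel code. Every demo_* name, the DEMO_INVALID_LEN value, and the
mock host are assumptions made up for illustration. The single-request
function only reports what the host said (error code plus required page
count), and the caller owns the reallocate-and-retry policy, including
the case where the host grows the length between attempts.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define DEMO_PAGE_SIZE   4096u
#define DEMO_INVALID_LEN 1u   /* placeholder, not the real exitinfo2 encoding */

/* Mock host: the required page count may change from call to call. */
struct demo_host {
	const unsigned int *npages_by_call; /* pages required at each call */
	size_t ncalls;                      /* entries in npages_by_call */
	size_t calls_made;
};

struct demo_req {
	uint64_t fw_err;          /* what the host reported */
	unsigned int data_npages; /* in: pages provided, out: pages required */
	void *certs;              /* caller-owned extended data buffer */
};

/* Issue exactly one request and report the result. No retry logic here. */
static int demo_issue_request(struct demo_host *host, struct demo_req *req)
{
	size_t idx;
	unsigned int need;

	assert(host->ncalls > 0);
	idx = host->calls_made < host->ncalls ? host->calls_made
					      : host->ncalls - 1;
	need = host->npages_by_call[idx];
	host->calls_made++;

	if (req->data_npages < need) {
		req->data_npages = need; /* pass the required count back */
		req->fw_err = DEMO_INVALID_LEN;
		return -1;
	}
	req->fw_err = 0;
	return 0;
}

/* Driver-side policy: grow the buffer and ask for extended data again. */
static int demo_get_ext_report(struct demo_host *host, struct demo_req *req,
			       int max_tries)
{
	int rc = -1;

	for (int i = 0; i < max_tries; i++) {
		void *p;

		rc = demo_issue_request(host, req);
		if (rc == 0 || req->fw_err != DEMO_INVALID_LEN)
			return rc;

		p = realloc(req->certs, (size_t)req->data_npages * DEMO_PAGE_SIZE);
		if (!p)
			return -2;
		req->certs = p;
	}
	return rc;
}
```

A driver that prefers the other strategy would replace the realloc step
with a plain (non-extended) request; demo_issue_request would not need
to change either way.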

When throttling comes into play and retrying needs to happen more than
once, then we're in another situation where the sev-guest driver also
owns the responsibility of trying not to get throttled too hard. My
patches suggest a 2 Hz rate limit to avoid any big penalties of running
requests in a tight loop and looking like a DoS antagonist, but that
doesn't belong in arch/x86/kernel/sev.c due to the variability of
strategies.
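
As an illustration of the kind of policy I mean, here is a small
user-space simulation of a retry loop held to 2 requests per second. It
is only a sketch under assumptions: the demo_* names, the
DEMO_THROTTLED code, the simulated clock, and the mock host are all
hypothetical, and a real driver would sleep rather than advance a
counter.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_RATE_HZ         2u
#define DEMO_MIN_INTERVAL_MS (1000u / DEMO_RATE_HZ)
#define DEMO_THROTTLED       (-16) /* placeholder "try again later" code */

struct demo_limiter {
	uint64_t now_ms;        /* simulated clock */
	uint64_t last_issue_ms; /* when the previous request was sent */
	int issued_once;
};

/* Mock host: reports "throttled" for the first throttle_count calls. */
struct demo_throttling_host {
	unsigned int throttle_count;
	unsigned int calls;
};

/* How long the caller must wait before the next request is allowed. */
static uint64_t demo_wait_needed(const struct demo_limiter *l)
{
	uint64_t elapsed;

	if (!l->issued_once)
		return 0;
	elapsed = l->now_ms - l->last_issue_ms;
	return elapsed >= DEMO_MIN_INTERVAL_MS ? 0
					       : DEMO_MIN_INTERVAL_MS - elapsed;
}

static int demo_issue_once(struct demo_throttling_host *h)
{
	h->calls++;
	return h->calls <= h->throttle_count ? DEMO_THROTTLED : 0;
}

/* Retry while throttled, never issuing faster than DEMO_RATE_HZ. */
static int demo_issue_with_retry(struct demo_limiter *l,
				 struct demo_throttling_host *h,
				 unsigned int max_tries)
{
	int rc = DEMO_THROTTLED;

	assert(max_tries > 0);
	for (unsigned int i = 0; i < max_tries; i++) {
		l->now_ms += demo_wait_needed(l); /* stand-in for sleeping */
		l->last_issue_ms = l->now_ms;
		l->issued_once = 1;

		rc = demo_issue_once(h);
		if (rc != DEMO_THROTTLED)
			return rc;
	}
	return rc;
}
```

The point is that the interval, the retry cap, and what to do on giving
up are all policy knobs, which is why they fit better in the driver
than in the single-request path.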

> + input->data_npages = npages;
> + *fw_err = SNP_GUEST_REQ_INVALID_LEN;
> + } else
> + *fw_err = ghcb->save.sw_exit_info_2;

I think in both branches of the conditional, fw_err gets set to
exit_info_2. See v4 of my throttling patch series.

--
-Dionna Glaze, PhD (she/her)