On Mon, Sep 19, 2022 at 5:47 PM Ashish Kalra <ashkalra@xxxxxxx> wrote:
On 9/19/22 22:18, Tom Lendacky wrote:
On 9/19/22 17:02, Alper Gun wrote:
On Mon, Sep 19, 2022 at 2:38 PM Tom Lendacky
<thomas.lendacky@xxxxxxx> wrote:
On 9/19/22 12:53, Alper Gun wrote:
On Fri, Aug 19, 2022 at 9:54 AM Peter Gonda <pgonda@xxxxxxxxxx> wrote:
+
+static int __snp_handle_page_state_change(struct kvm_vcpu *vcpu,
enum psc_op op, gpa_t gpa,
+ int level)
+{
+ struct kvm_sev_info *sev = &to_kvm_svm(vcpu->kvm)->sev_info;
+ struct kvm *kvm = vcpu->kvm;
+ int rc, npt_level;
+ kvm_pfn_t pfn;
+ gpa_t gpa_end;
+
+ gpa_end = gpa + page_level_size(level);
+
+ while (gpa < gpa_end) {
+ /*
+ * If the gpa is not present in the NPT then
build the NPT.
+ */
+ rc = snp_check_and_build_npt(vcpu, gpa, level);
+ if (rc)
+ return -EINVAL;
+
+ if (op == SNP_PAGE_STATE_PRIVATE) {
+ hva_t hva;
+
+ if (snp_gpa_to_hva(kvm, gpa, &hva))
+ return -EINVAL;
+
+ /*
+ * Verify that the hva range is
registered. This enforcement is
+ * required to avoid the cases where a
page is marked private
+ * in the RMP table but never gets
cleanup during the VM
+ * termination path.
+ */
+ mutex_lock(&kvm->lock);
+ rc = is_hva_registered(kvm, hva,
page_level_size(level));
+ mutex_unlock(&kvm->lock);
+ if (!rc)
+ return -EINVAL;
+
+ /*
+ * Mark the userspace range unmerable
before adding the pages
+ * in the RMP table.
+ */
+ mmap_write_lock(kvm->mm);
+ rc = snp_mark_unmergable(kvm, hva,
page_level_size(level));
+ mmap_write_unlock(kvm->mm);
+ if (rc)
+ return -EINVAL;
+ }
+
+ write_lock(&kvm->mmu_lock);
+
+ rc = kvm_mmu_get_tdp_walk(vcpu, gpa, &pfn,
&npt_level);
+ if (!rc) {
+ /*
+ * This may happen if another vCPU
unmapped the page
+ * before we acquire the lock. Retry the
PSC.
+ */
+ write_unlock(&kvm->mmu_lock);
+ return 0;
+ }
I think we want to return -EAGAIN or similar if we want the caller to
retry, right? I think returning 0 here hides the error.
The problem here is that the caller(linux guest kernel) doesn't retry
if PSC fails. The current implementation in the guest kernel is that
if a page state change request fails, it terminates the VM with
GHCB_TERM_PSC reason.
Returning 0 here is not a good option because it will fail the PSC
silently and will probably cause a nested RMP fault later. Returning
Returning 0 here is ok because the PSC current index into the PSC
structure will not be updated and the guest will then retry (see the
loop
in vmgexit_psc() in arch/x86/kernel/sev.c).
Thanks,
Tom
But the host code updates the index. It doesn't leave the loop because
rc is 0. The guest will think that it is successful.
rc = __snp_handle_page_state_change(vcpu, op, gpa, level);
if (rc)
goto out;
Also the page state change request with MSR is not retried. It
terminates the VM if the MSR request fails.
Ah, right. I see what you mean. It should probably return a -EAGAIN
instead of 0 and then the if (rc) check should be modified to
specifically look for -EAGAIN and goto out after setting rc to 0.
But that does leave the MSR protocol open to the problem that you
mention, so, yes, retry logic in snp_handle_page_state_change() for a
-EAGAIN seems reasonable.
Thanks,
Tom
I believe it makes more sense to add the retry logic within
__snp_handle_page_state_change() itself, as that will make it work for
both the GHCB MSR protocol and the GHCB VMGEXIT requests.
You are suggesting we just retry 'kvm_mmu_get_tdp_walk' inside of
__snp_handle_page_state_change()? That should work but how many times
do we retry? If we return EAGAIN or error we can leave it up to the
caller
an error also terminates the guest immediately with current guest
implementation. I think the best approach here is adding a retry logic
to this function. Retrying without returning an error should help it
work because snp_check_and_build_npt will be called again and in the
second attempt this should work.
+
+ /*
+ * Adjust the level so that we don't go higher
than the backing
+ * page level.
+ */
+ level = min_t(size_t, level, npt_level);
+
+ trace_kvm_snp_psc(vcpu->vcpu_id, pfn, gpa, op,
level);
+
+ switch (op) {
+ case SNP_PAGE_STATE_SHARED:
+ rc = snp_make_page_shared(kvm, gpa, pfn,
level);
+ break;
+ case SNP_PAGE_STATE_PRIVATE:
+ rc = rmp_make_private(pfn, gpa, level,
sev->asid, false);
+ break;
+ default:
+ rc = -EINVAL;
+ break;
+ }
+
+ write_unlock(&kvm->mmu_lock);
+
+ if (rc) {
+ pr_err_ratelimited("Error op %d gpa %llx
pfn %llx level %d rc %d\n",
+ op, gpa, pfn, level, rc);
+ return rc;
+ }
+
+ gpa = gpa + page_level_size(level);
+ }
+
+ return 0;
+}
+