Re: [PATCH Part2 RFC v4 35/40] KVM: Add arch hooks to track the host write to guest memory

From: Brijesh Singh
Date: Tue Jul 20 2021 - 11:54:24 EST




On 7/19/21 6:30 PM, Sean Christopherson wrote:
...>
NAK on converting RMP entries in response to guest accesses. Corrupting guest
data (due to dropping the "validated" flag) on a rogue/incorrect guest emulation
request or misconfigured PV feature is double ungood. The potential kernel panic
below isn't much better.


I also debated myself whether its okay to transition the page state to shared to complete the write operation. I am good with removing the converting RMP entries from the patch, and that will also remove the kernel panic code.


And I also don't think we need this heavyweight flow for user access, e.g.
__copy_to_user(), just eat the RMP violation #PF like all other #PFs and exit
to userspace with -EFAULT.


Yes, we could improve the __copy_to_user() to eat the RMP violation. I was tempted to go on that path but struggled to find a strong reason for it and was not sure if that accepted. I can add that support in next rev.



kvm_vcpu_map() and friends might need the manual lookup, at least initially,

Yes, the enhancement to the __copy_to_user() does not solve this problem and for it we need to do the manually lookup.


but
in an ideal world that would be naturally handled by gup(), e.g. by unmapping
guest private memory or whatever approach TDX ends up employing to avoid #MCs.


+ */
+ if (rmpentry_assigned(e)) {
+ pr_err("SEV-SNP: write to guest private gfn %llx\n", gfn);
+ rc = snp_make_page_shared(kvm_get_vcpu(kvm, 0),
+ gfn << PAGE_SHIFT, pfn, PG_LEVEL_4K);
+ BUG_ON(rc != 0);
+ }
+}

...

+void kvm_arch_write_gfn_begin(struct kvm *kvm, struct kvm_memory_slot *slot, gfn_t gfn)
+{
+ update_gfn_track(slot, gfn, KVM_PAGE_TRACK_WRITE, 1);

Tracking only writes isn't correct, as KVM reads to guest private memory will
return garbage. Pulling the rug out from under KVM reads won't fail as
spectacularly as writes (at least not right away), but they'll still fail. I'm
actually ok reading garbage if the guest screws up, but KVM needs consistent
semantics.

Good news is that per-gfn tracking is probably overkill anyways. As mentioned
above, user access don't need extra magic, they either fail or they don't.

For kvm_vcpu_map(), one thought would be to add a "short-term" map variant that
is not allowed to be retained across VM-Entry, and then use e.g. SRCU to block
PSC requests until there are no consumers.


Sounds good to me, i will add the support in the next rev.

thanks