Re: [PATCH 4/8] KVM: gmem: protect kvm_mmu_invalidate_end()

From: Kalra, Ashish
Date: Thu Aug 31 2023 - 12:50:35 EST


Hello Sean,

On 8/22/2023 6:17 PM, Sean Christopherson wrote:
On Mon, Aug 21, 2023, Ashish Kalra wrote:
Hello Mingwei & Sean,

On 8/18/2023 9:08 PM, Mingwei Zhang wrote:
The maximum hits are seen with shmem_fallocate and madvise, which we believe
are response to shared<->private
GHCB page-state-chage requests. discard=both handles discard both for
private and shared memory, so freeing shared memory
via fallocate(shared_memfd, FALLOC_FL_PUNCH_HOLE, ...) would trigger the
notifiers when freeing shared pages after guest converts a GPA to
private.

Now, as with SNP+guest_memfd, guest private memory is not mapped in host
anymore, so i added a generic fix (instead of Sean's proposed patch of
checking for SNP guest inside sev_guest_memory_reclaimed()):

--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -593,6 +593,9 @@ static __always_inline int __kvm_handle_hva_range(struct
kvm *kvm,
unsigned long hva_start, hva_end;

slot = container_of(node, struct kvm_memory_slot,
hva_node[slots->node_idx]);
+ if (kvm_slot_can_be_private(slot)) {
+ continue;
+ }
hva_start = max(range->start, slot->userspace_addr);
hva_end = min(range->end, slot->userspace_addr +
(slot->npages <<
PAGE_SHIFT));

...

As expected, the SEV hook is not invoked for the guest private memory pages
(no more invalidation from shmem_fallocate() + madvise()).

Isn't it better to skip invoking the KVM MMU invalidation notifier when the
invalidated range belongs to guest private memory ?

Oooh, you're running into problems where KVM blasts both the private and shared
mappings even though invalidations from the mmu_notifier are shared-only by
definition.

The answer is "yes", but simply skipping slots that _can_ be private is wrong,
as KVM still needs to zap any shared mappings. I have a plan[*], but I completely
spaced on incorporating the idea into the gmem RFC. I'll add that to the "list
of todos for merging gmem", which I need to get sent out asap.

https://lore.kernel.org/all/ZJX0hk+KpQP0KUyB@xxxxxxxxxx

Looking at your gmem TODO's post, i don't see anything specific for this support:

https://lore.kernel.org/kvm/ZOjpIL0SFH+E3Dj4@xxxxxxxxxx/

Thanks,
Ashish


In fact, AFAIC, SNP VM does not track whether each page is previously
shared, isn't it? If a page was previously shared and was written by the
host kernel or devices before it was changed to private. No one tracks it
and dirty caches are there!

The skipped invalidation here covered the case Mingwei mentioned above,
where the pages are changed from private->shared and subsequent freeing of
shared pages triggered the invalidation.

But, then why are we concerned about this, i thought we have concerns about
the case where the dirty cache lines contain encrypted guest data ?

Yes, that's my understanding as well (assuming by "this" you mean the case where
the CPU cache has dirty lines for _shared_ addresses).