[PATCH 2/2] KVM: Do not speculatively mark pfn cache valid to "fix" race

From: Sean Christopherson
Date: Tue Apr 19 2022 - 20:49:19 EST


When refreshing a stale pfn cache, mark the cache valid only after the
new pfn (and, if necessary, the kernel mapping) has been acquired.
Speculatively marking the cache valid with garbage pfn/khva values can
lead to other tasks consuming the garbage. Presumably, the cache was
previously marked valid before dropping gpc->lock in order to address a
race where an mmu_notifier event could invalidate the cache between
retrieving the pfn via hva_to_pfn() and re-acquiring gpc->lock. That
race has since been plugged: hva_to_pfn_retry() now waits for
mn_active_invalidate_count to reach zero before returning, i.e. waits
for all in-flight notifiers to fully complete.
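As a rough userspace sketch of the retry discipline described above
(this is a simplified model, not the KVM code; mn_active_invalidate_count,
mmu_notifier_seq, and the toy hva_to_pfn() below are stand-ins):

```c
#include <stdatomic.h>
#include <stdint.h>

/* Stand-ins for kernel state; names mirror the real fields, but this
 * is a hypothetical single-file model, not the KVM implementation. */
static atomic_int mn_active_invalidate_count;
static atomic_ulong mmu_notifier_seq;

/* Toy hva=>pfn translation for the sketch (4KiB pages). */
static uint64_t hva_to_pfn(uint64_t hva)
{
	return hva >> 12;
}

/* Keep re-deriving the pfn until the sequence snapshot taken before
 * the translation is still current *and* no invalidation is in
 * flight, i.e. any notifier has fully completed before we return. */
static uint64_t hva_to_pfn_retry(uint64_t hva)
{
	uint64_t pfn;
	unsigned long seq;

	do {
		seq = atomic_load(&mmu_notifier_seq);
		pfn = hva_to_pfn(hva);
	} while (atomic_load(&mn_active_invalidate_count) ||
		 seq != atomic_load(&mmu_notifier_seq));

	return pfn;
}
```

Because the loop only exits with the count at zero and an unchanged
sequence, a returned pfn cannot have been invalidated mid-translation.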

Note, this could also be "fixed" by having kvm_gfn_to_pfn_cache_check()
also verify gpc->khva, but that's unnecessary now that the aforementioned
race between refresh and mmu_notifier invalidation has been eliminated.

        CPU0                                CPU1
        ----                                ----

  gpc->valid == false

  kvm_gfn_to_pfn_cache_refresh()
  |
  |-> gpc->valid = true;

  hva_to_pfn_retry()
  |
  |-> drop gpc->lock

                                      acquire gpc->lock for read
                                      kvm_gfn_to_pfn_cache_check()
                                      |
                                      |-> gpc->valid == true
                                          gpc->khva == NULL

                                      caller dereferences NULL khva
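The window in the diagram can be modeled in plain C (a hypothetical,
single-threaded sketch; the struct and helper names only mirror the
kernel's, and check_valid_only() stands in for the real check(), which
does not look at khva):

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PFN_ERR_FAULT ((uint64_t)-1)

/* Simplified stand-in for struct gfn_to_pfn_cache. */
struct pfn_cache {
	bool valid;
	uint64_t pfn;
	void *khva;
};

/* The removed placeholder assignments: the cache claims validity
 * while pfn/khva still hold garbage; this is exactly CPU1's window. */
static void refresh_speculative_prefix(struct pfn_cache *gpc)
{
	gpc->pfn = PFN_ERR_FAULT;
	gpc->khva = NULL;
	gpc->valid = true;
}

/* The fixed publication order: valid flips only once pfn and khva
 * are final, so a reader that sees valid never sees garbage. */
static void refresh_publish(struct pfn_cache *gpc, uint64_t pfn, void *khva)
{
	gpc->pfn = pfn;
	gpc->khva = khva;
	gpc->valid = true;
}

/* Mirrors the real check(): it trusts valid and never inspects khva. */
static bool check_valid_only(const struct pfn_cache *gpc)
{
	return gpc->valid;
}
```

With the speculative prefix, check_valid_only() passes while khva is
still NULL; with the fixed order, a passing check implies khva is set.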

Opportunistically add a lockdep assertion to the check() helper to make
it more obvious that holding gpc->lock is what guarantees there is no
memory ordering issue between setting "valid" and setting "pfn" and/or
"khva".

Fixes: 982ed0de4753 ("KVM: Reinstate gfn_to_pfn_cache with invalidation support")
Cc: stable@xxxxxxxxxxxxxxx
Cc: David Woodhouse <dwmw@xxxxxxxxxxxx>
Cc: Mingwei Zhang <mizhang@xxxxxxxxxx>
Signed-off-by: Sean Christopherson <seanjc@xxxxxxxxxx>
---
virt/kvm/pfncache.c | 9 +++------
1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/virt/kvm/pfncache.c b/virt/kvm/pfncache.c
index 71c84a43024c..72eee096a7cd 100644
--- a/virt/kvm/pfncache.c
+++ b/virt/kvm/pfncache.c
@@ -81,6 +81,8 @@ bool kvm_gfn_to_pfn_cache_check(struct kvm *kvm, struct gfn_to_pfn_cache *gpc,
{
struct kvm_memslots *slots = kvm_memslots(kvm);

+ lockdep_assert_held_read(&gpc->lock);
+
if ((gpa & ~PAGE_MASK) + len > PAGE_SIZE)
return false;

@@ -226,11 +228,6 @@ int kvm_gfn_to_pfn_cache_refresh(struct kvm *kvm, struct gfn_to_pfn_cache *gpc,
if (!old_valid || old_uhva != gpc->uhva) {
void *new_khva = NULL;

- /* Placeholders for "hva is valid but not yet mapped" */
- gpc->pfn = KVM_PFN_ERR_FAULT;
- gpc->khva = NULL;
- gpc->valid = true;
-
new_pfn = hva_to_pfn_retry(kvm, gpc);
if (is_error_noslot_pfn(new_pfn)) {
ret = -EFAULT;
@@ -259,7 +256,7 @@ int kvm_gfn_to_pfn_cache_refresh(struct kvm *kvm, struct gfn_to_pfn_cache *gpc,
gpc->pfn = KVM_PFN_ERR_FAULT;
gpc->khva = NULL;
} else {
- /* At this point, gpc->valid may already have been cleared */
+ gpc->valid = true;
gpc->pfn = new_pfn;
gpc->khva = new_khva;
}
--
2.36.0.rc0.470.gd361397f0d-goog