From: Isaku Yamahata <isaku.yamahata@xxxxxxxxx>
Update TDX's hook of set_external_spte() to record pre-mapping cnt instead
of doing nothing and returning when TD is not finalized.
TDX uses ioctl KVM_TDX_INIT_MEM_REGION to initialize its initial guest
memory. This ioctl calls kvm_gmem_populate() to get guest pages and in
tdx_gmem_post_populate(), it will
(1) Map page table pages into KVM mirror page table and private EPT.
(2) Map guest pages into KVM mirror page table. In the propagation hook,
just record pre-mapping cnt without mapping the guest page into private
EPT.
(3) Map guest pages into private EPT and decrease pre-mapping cnt.
Do not map guest pages into private EPT directly in step (2), because TDX
requires TDH.MEM.PAGE.ADD() to add a guest page before TD is finalized,
which copies page content from a source page from user to target guest page
to be added. However, source page is not available via common interface
kvm_tdp_map_page() in step (2).
Therefore, just pre-map the guest page into KVM mirror page table and
record the pre-mapping cnt in TDX's propagation hook. The pre-mapping cnt
would be decreased in ioctl KVM_TDX_INIT_MEM_REGION when the guest page is
mapped into private EPT.
Co-developed-by: Yan Zhao <yan.y.zhao@xxxxxxxxx>
Signed-off-by: Yan Zhao <yan.y.zhao@xxxxxxxxx>
Signed-off-by: Isaku Yamahata <isaku.yamahata@xxxxxxxxx>
Signed-off-by: Rick Edgecombe <rick.p.edgecombe@xxxxxxxxx>
---
TDX MMU part 2 v1:
- Update the code comment and patch log according to latest gmem update.
https://lore.kernel.org/kvm/CABgObfa=a3cKcKJHQRrCs-3Ty8ppSRou=dhi6Q+KdZnom0Zegw@xxxxxxxxxxxxxx/
- Rename tdx_mem_page_add() to tdx_mem_page_record_premap_cnt() to avoid
confusion.
- Change the patch title to "KVM: TDX: Premap initial guest memory".
- Rename KVM_MEMORY_MAPPING => KVM_MAP_MEMORY (Sean)
- Drop issueing TDH.MEM.PAGE.ADD() on KVM_MAP_MEMORY(), defer it to
KVM_TDX_INIT_MEM_REGION. (Sean)
- Added nr_premapped to track the number of premapped pages
- Drop tdx_post_mmu_map_page().
v19:
- Switched to use KVM_MEMORY_MAPPING
- Dropped measurement extension
- updated commit message. private_page_add() => set_private_spte()
---
arch/x86/kvm/vmx/tdx.c | 40 +++++++++++++++++++++++++++++++++-------
arch/x86/kvm/vmx/tdx.h | 2 +-
2 files changed, 34 insertions(+), 8 deletions(-)
diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index 59b627b45475..435112562954 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -488,6 +488,34 @@ static int tdx_mem_page_aug(struct kvm *kvm, gfn_t gfn,
return 0;
}
+/*
+ * KVM_TDX_INIT_MEM_REGION calls kvm_gmem_populate() to get guest pages and
+ * tdx_gmem_post_populate() to premap page table pages into private EPT.
+ * Mapping guest pages into private EPT before TD is finalized should use a
+ * seamcall TDH.MEM.PAGE.ADD(), which copies page content from a source page
+ * from user to target guest pages to be added. This source page is not
+ * available via common interface kvm_tdp_map_page(). So, currently,
+ * kvm_tdp_map_page() only premaps guest pages into KVM mirrored root.
+ * A counter nr_premapped is increased here to record status. The counter will
+ * be decreased after TDH.MEM.PAGE.ADD() is called after the kvm_tdp_map_page()
+ * in tdx_gmem_post_populate().
+ */
+static int tdx_mem_page_record_premap_cnt(struct kvm *kvm, gfn_t gfn,
+ enum pg_level level, kvm_pfn_t pfn)
+{
+ struct kvm_tdx *kvm_tdx = to_kvm_tdx(kvm);
+
+ /* Returning error here to let TDP MMU bail out early. */
+ if (KVM_BUG_ON(level != PG_LEVEL_4K, kvm)) {
+ tdx_unpin(kvm, pfn);
+ return -EINVAL;
+ }
+
+ /* nr_premapped will be decreased when tdh_mem_page_add() is called. */
+ atomic64_inc(&kvm_tdx->nr_premapped);
+ return 0;
+}
+
int tdx_sept_set_private_spte(struct kvm *kvm, gfn_t gfn,
enum pg_level level, kvm_pfn_t pfn)
{
@@ -510,11 +538,7 @@ int tdx_sept_set_private_spte(struct kvm *kvm, gfn_t gfn,
if (likely(is_td_finalized(kvm_tdx)))
return tdx_mem_page_aug(kvm, gfn, level, pfn);
- /*
- * TODO: KVM_MAP_MEMORY support to populate before finalize comes
- * here for the initial memory.
- */
- return 0;
+ return tdx_mem_page_record_premap_cnt(kvm, gfn, level, pfn);
}
static int tdx_sept_drop_private_spte(struct kvm *kvm, gfn_t gfn,
@@ -546,10 +570,12 @@ static int tdx_sept_drop_private_spte(struct kvm *kvm, gfn_t gfn,
if (unlikely(!is_td_finalized(kvm_tdx) &&
err == (TDX_EPT_WALK_FAILED | TDX_OPERAND_ID_RCX))) {
/*
- * This page was mapped with KVM_MAP_MEMORY, but
- * KVM_TDX_INIT_MEM_REGION is not issued yet.
+ * Page is mapped by KVM_TDX_INIT_MEM_REGION, but hasn't called
+ * tdh_mem_page_add().
*/
if (!is_last_spte(entry, level) || !(entry & VMX_EPT_RWX_MASK)) {
+ WARN_ON_ONCE(!atomic64_read(&kvm_tdx->nr_premapped));
+ atomic64_dec(&kvm_tdx->nr_premapped);
tdx_unpin(kvm, pfn);
return 0;
}
diff --git a/arch/x86/kvm/vmx/tdx.h b/arch/x86/kvm/vmx/tdx.h
index 66540c57ed61..25a4aaede2ba 100644
--- a/arch/x86/kvm/vmx/tdx.h
+++ b/arch/x86/kvm/vmx/tdx.h
@@ -26,7 +26,7 @@ struct kvm_tdx {
u64 tsc_offset;
- /* For KVM_MAP_MEMORY and KVM_TDX_INIT_MEM_REGION. */
+ /* For KVM_TDX_INIT_MEM_REGION. */
atomic64_t nr_premapped;
struct kvm_cpuid2 *cpuid;