[PATCH 1/7] KVM: TDX: Return -EBUSY when tdh_mem_page_add() encounters TDX_OPERAND_BUSY

From: Yan Zhao
Date: Sun Jan 12 2025 - 21:11:51 EST


tdh_mem_page_add() is called during TD build time. Within the TDX module,
it acquires the exclusive lock on the TDR resource (eliminating the need to
hold locks for TDCS/SEPT tree) and the exclusive lock on the PAMT entry for
the page to be added. The TDX module returns TDX_OPERAND_BUSY if
tdh_mem_page_add() contends with other SEAMCALLs.

SEAMCALL Lock Type Resource
-----------------------------------------------------------------------
tdh_mem_page_add EXCLUSIVE TDR
NO_LOCK TDCS
NO_LOCK SEPT tree
EXCLUSIVE PAMT entry for the page to add

Given (1)-(4) and the expected behavior from userspace (5), KVM doesn't
expect tdh_mem_page_add() to encounter TDX_OPERAND_BUSY:
(1) tdx_vcpu_create() only allows vCPU creation when the TD state is
TD_STATE_INITIALIZED, so tdh_mem_page_add(), as invoked in vCPU ioctl,
does not contend with
tdh_mng_create()/tdh_mng_addcx()/tdh_mng_key_config()/tdh_mng_init().

(2) tdx_vcpu_ioctl() bails out on TD_STATE_RUNNABLE, so tdh_mem_page_add()
does not contend with tdh_vp_enter()/tdh_mem_page_aug()/tdh_mem_track()
and TDCALLs.

(3) By holding slots_lock and the filemap invalidate lock,
tdh_mem_page_add() does not contend with tdh_mr_finalize(),
tdh_mem_page_remove()/tdh_mem_range_block()/
tdh_phymem_page_wbinvd_hkid() or another tdh_mem_page_add(),
tdh_mem_sept_add()/tdh_mr_extend().

(4) By holding reference to kvm, tdh_mem_page_add() does not contend with
tdh_mng_vpflushdone()/tdh_phymem_cache_wb()/tdh_mng_key_freeid()/
tdh_phymem_page_wbinvd_tdr()/tdh_phymem_page_reclaim().

(5) A well-behaved userspace invokes ioctl KVM_TDX_INIT_MEM_REGION on one
vCPU after initializing all vCPUs and does not invoke ioctls on the
other vCPUs before the KVM_TDX_INIT_MEM_REGION completes. Thus,
tdh_mem_page_add() does not contend with
tdh_vp_create()/tdh_vp_addcx()/tdh_vp_init*()/tdh_vp_rd()/tdh_vp_wr()/
tdh_mng_rd()/tdh_vp_flush() on the other vCPUs.

However, if the userspace breaks (5), tdh_mem_page_add() could encounter
TDX_OPERAND_BUSY when trying to acquire the exclusive lock on the TDR
resource in the TDX module. In this case, simply return -EBUSY.

Signed-off-by: Yan Zhao <yan.y.zhao@xxxxxxxxx>
---
tdx_vcpu_pre_run() will check TD_STATE_RUNNABLE for (2).
https://lore.kernel.org/kvm/3576c721-3ef2-40bd-8764-b50912df93a2@xxxxxxxxx/
---
arch/x86/kvm/vmx/tdx.c | 10 ++++------
1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index d0dc3200fa37..1cf3ef0faff7 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -3024,13 +3024,11 @@ static int tdx_gmem_post_populate(struct kvm *kvm, gfn_t gfn, kvm_pfn_t pfn,
}

ret = 0;
- do {
- err = tdh_mem_page_add(kvm_tdx->tdr_pa, gpa, pfn_to_hpa(pfn),
- pfn_to_hpa(page_to_pfn(page)),
- &entry, &level_state);
- } while (err == TDX_ERROR_SEPT_BUSY);
+ err = tdh_mem_page_add(kvm_tdx->tdr_pa, gpa, pfn_to_hpa(pfn),
+ pfn_to_hpa(page_to_pfn(page)),
+ &entry, &level_state);
if (err) {
- ret = -EIO;
+ ret = unlikely(err & TDX_OPERAND_BUSY) ? -EBUSY : -EIO;
goto out;
}

--
2.43.2