[PATCH v6 05/11] x86/virt/tdx: Handle concurrent callers in tdx_pamt_get/put()
From: Rick Edgecombe
Date: Mon May 25 2026 - 22:37:30 EST
From: "Kirill A. Shutemov" <kirill.shutemov@xxxxxxxxxxxxxxx>
tdx_pamt_get()/tdx_pamt_put() unconditionally add or remove Dynamic PAMT
backing for the 2MB region covering the passed pfn. However, multiple
callers can concurrently operate on 4KB pages that fall within the same
2MB region. When this happens only one Dynamic PAMT page pair needs to be
installed to cover the 2MB range. And when one page is freed, the Dynamic
PAMT backing cannot be freed until all pages in the range are no longer in
use. Make the helpers handle these races internally.
Use the per-2MB refcounts from previous changes to track how many 4KB
pages are in use within each region. Gate the actual Dynamic PAMT add and
remove on refcount transitions (0->1 and 1->0). Serialize the refcount
check and SEAMCALL with a global spinlock so the read-decide-act sequence
is atomic. This also avoids TDX module BUSY errors, as Dynamic PAMT add
and remove SEAMCALLs take internal TDX module locks at 2MB granularity,
so simultaneous attempts on the same region would conflict.
The lock is global and heavyweight. Use simple conditional logic to keep
correctness obvious. This will be optimized in a later change.
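The gating described above can be illustrated with a minimal userspace sketch. It is not the kernel code: it assumes a single 2MB region, uses a pthread mutex in place of the spinlock, and the fake_pamt_add()/fake_pamt_remove() helpers are hypothetical stand-ins for the SEAMCALL wrappers that only count calls. The pre-allocation of the PAMT page pair done by the real helper is omitted.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the SEAMCALL wrappers; they only count calls. */
static int backing_adds, backing_removes;
static bool fake_pamt_add(void)    { backing_adds++; return true; }
static bool fake_pamt_remove(void) { backing_removes++; return true; }

/* One refcount for a single 2MB region; the patch keeps one per region. */
static int region_refcount;
static pthread_mutex_t region_lock = PTHREAD_MUTEX_INITIALIZER;

/* Take a reference; install backing only on the 0->1 transition. */
static int region_get(void)
{
	int ret = 0;

	pthread_mutex_lock(&region_lock);
	if (region_refcount > 0)
		region_refcount++;		/* backing already present */
	else if (fake_pamt_add())
		region_refcount = 1;		/* first user installs backing */
	else
		ret = -1;			/* add failed, count stays 0 */
	pthread_mutex_unlock(&region_lock);

	return ret;
}

/* Drop a reference; remove backing only on the 1->0 transition. */
static void region_put(void)
{
	pthread_mutex_lock(&region_lock);
	assert(region_refcount > 0);
	if (region_refcount > 1)
		region_refcount--;		/* other users remain */
	else if (fake_pamt_remove())
		region_refcount = 0;		/* last user removes backing */
	pthread_mutex_unlock(&region_lock);
}
```

Because the check and the add/remove happen under one lock, two callers racing on the same region cannot both observe a zero count and both try to install backing.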
Assisted-by: GitHub Copilot:claude-opus-4-6 Claude:claude-opus-4-7
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@xxxxxxxxxxxxxxx>
Co-developed-by: Rick Edgecombe <rick.p.edgecombe@xxxxxxxxx>
Signed-off-by: Rick Edgecombe <rick.p.edgecombe@xxxxxxxxx>
---
v6:
- Split from "x86/virt/tdx: Add tdx_alloc/free_control_page() helpers"
- Return 0 instead of ret to be clearer (Binbin)
- Clarify log (Nikolay)
- Justify why the patch is not optimized, in response to comments
(Nikolay)
- Move tdx_find_pamt_refcount() to facilitate patch re-order
- Adjustments from dropping error helper patches
- Log tweaks
---
arch/x86/virt/vmx/tdx/tdx.c | 72 ++++++++++++++++++++++++++++---------
1 file changed, 56 insertions(+), 16 deletions(-)
diff --git a/arch/x86/virt/vmx/tdx/tdx.c b/arch/x86/virt/vmx/tdx/tdx.c
index 6658a6be6697c..50333eb96efa6 100644
--- a/arch/x86/virt/vmx/tdx/tdx.c
+++ b/arch/x86/virt/vmx/tdx/tdx.c
@@ -2043,10 +2043,14 @@ static u64 tdh_phymem_pamt_remove(kvm_pfn_t pfn, struct page **pamt_pages)
return 0;
}
-/* Allocate PAMT memory for the given page */
+/* Serializes adding/removing PAMT memory */
+static DEFINE_SPINLOCK(pamt_lock);
+
+/* Bump PAMT refcount for the given page and allocate PAMT memory if needed */
static int tdx_pamt_get(kvm_pfn_t pfn)
{
struct page *pamt_pages[TDX_DPAMT_ENTRY_PAGE_CNT];
+ atomic_t *pamt_refcount;
u64 tdx_status;
int ret;
@@ -2057,10 +2061,26 @@ static int tdx_pamt_get(kvm_pfn_t pfn)
if (ret)
return ret;
- tdx_status = tdh_phymem_pamt_add(pfn, pamt_pages);
- if (tdx_status != TDX_SUCCESS) {
- ret = -EIO;
- goto out_free;
+ pamt_refcount = tdx_find_pamt_refcount(pfn);
+
+ scoped_guard(spinlock, &pamt_lock) {
+ /*
+ * If the pamt page is already added (i.e. refcount >= 1),
+ * then just increment the refcount.
+ */
+ if (atomic_read(pamt_refcount)) {
+ atomic_inc(pamt_refcount);
+ goto out_free;
+ }
+
+ /* Try to add the pamt page and take the refcount 0->1. */
+ tdx_status = tdh_phymem_pamt_add(pfn, pamt_pages);
+ if (WARN_ON_ONCE(tdx_status != TDX_SUCCESS)) {
+ ret = -EIO;
+ goto out_free;
+ }
+
+ atomic_set(pamt_refcount, 1);
}
return 0;
@@ -2069,26 +2089,46 @@ static int tdx_pamt_get(kvm_pfn_t pfn)
return ret;
}
-/* Free PAMT memory for the given page */
+/*
+ * Drop PAMT refcount for the given page and free PAMT memory if it is no
+ * longer needed.
+ */
static void tdx_pamt_put(kvm_pfn_t pfn)
{
struct page *pamt_pages[TDX_DPAMT_ENTRY_PAGE_CNT] = {};
+ atomic_t *pamt_refcount;
u64 tdx_status;
if (!tdx_supports_dynamic_pamt(&tdx_sysinfo))
return;
- tdx_status = tdh_phymem_pamt_remove(pfn, pamt_pages);
+ pamt_refcount = tdx_find_pamt_refcount(pfn);
- /*
- * Don't free pamt_pages as it could hold garbage when
- * tdh_phymem_pamt_remove() fails. Don't panic/BUG_ON(), as
- * there is no risk of data corruption, but do yell loudly as
- * failure indicates a kernel bug, memory is being leaked, and
- * the dangling PAMT entry may cause future operations to fail.
- */
- if (WARN_ON_ONCE(tdx_status != TDX_SUCCESS))
- return;
+ scoped_guard(spinlock, &pamt_lock) {
+ /*
+ * If there is more than one reference on the pamt page,
+ * don't remove it yet. Just decrement the refcount.
+ */
+ if (atomic_read(pamt_refcount) > 1) {
+ atomic_dec(pamt_refcount);
+ return;
+ }
+
+ /* Try to remove the pamt page and take the refcount 1->0. */
+ tdx_status = tdh_phymem_pamt_remove(pfn, pamt_pages);
+
+ /*
+ * Don't free pamt_pages as it could hold garbage when
+ * tdh_phymem_pamt_remove() fails. Don't panic/BUG_ON(), as
+ * there is no risk of data corruption, but do yell loudly as
+ * failure indicates a kernel bug, memory is being leaked, and
+ * the dangling PAMT entry may cause future operations to fail.
+ */
+ if (WARN_ON_ONCE(tdx_status != TDX_SUCCESS))
+ return;
+
+ atomic_set(pamt_refcount, 0);
+ }
free_pamt_array(pamt_pages);
}
--
2.54.0