Re: [PATCH v4 14/16] KVM: TDX: Reclaim PAMT memory

From: Binbin Wu

Date: Wed Nov 26 2025 - 03:53:54 EST

On 11/21/2025 8:51 AM, Rick Edgecombe wrote:
From: "Kirill A. Shutemov" <kirill.shutemov@xxxxxxxxxxxxxxx>

Call tdx_free_page() and tdx_pamt_put() on the paths that free TDX
pages.

The PAMT memory holds metadata for TDX-protected memory. With Dynamic
PAMT, PAMT_4K is allocated on demand. The kernel supplies the TDX module
with a few pages that cover 2M of host physical memory.

PAMT memory can be reclaimed when the last user is gone. This can happen
in a few code paths:

- On TDH.PHYMEM.PAGE.RECLAIM in tdx_reclaim_td_control_pages() and
tdx_reclaim_page().

- On TDH.MEM.PAGE.REMOVE in tdx_sept_drop_private_spte().

- In tdx_sept_zap_private_spte() for pages that were queued to be added
with TDH.MEM.PAGE.ADD, but the add never happened due to an error.

- In tdx_sept_free_private_spt() for SEPT pages.

Add tdx_pamt_put() for memory that comes from guest_memfd and use
tdx_free_page() for the rest.

External page table pages are not from guest_memfd, but tdx_pamt_put() is
still used for them in tdx_sept_free_private_spt().


Signed-off-by: Kirill A. Shutemov <kirill.shutemov@xxxxxxxxxxxxxxx>
[Minor log tweak]
Signed-off-by: Rick Edgecombe <rick.p.edgecombe@xxxxxxxxx>
---
v4:
- Rebasing on the post-populate series required some changes to how PAMT
refcounting is handled in the KVM_TDX_INIT_MEM_REGION path. Instead of
incrementing the DPAMT refcount on the fake add in the fault path, it is
now incremented only when tdh_mem_page_add() actually succeeds, as in
tdx_mem_page_aug(). Because of this, the special handling for the
tdx_is_sept_zap_err_due_to_premap() case is no longer needed.

v3:
- Minor log tweak to conform to kvm/x86 style.
---
arch/x86/kvm/vmx/tdx.c | 14 +++++++++++---
1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index 24322263ac27..f8de50e7dc7f 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -360,7 +360,7 @@ static void tdx_reclaim_control_page(struct page *ctrl_page)
 	if (tdx_reclaim_page(ctrl_page))
 		return;
 
-	__free_page(ctrl_page);
+	tdx_free_page(ctrl_page);
 }
 
 struct tdx_flush_vp_arg {
@@ -597,7 +597,7 @@ static void tdx_reclaim_td_control_pages(struct kvm *kvm)
 
 	tdx_quirk_reset_page(kvm_tdx->td.tdr_page);
 
-	__free_page(kvm_tdx->td.tdr_page);
+	tdx_free_page(kvm_tdx->td.tdr_page);
 	kvm_tdx->td.tdr_page = NULL;
 }
 
@@ -1827,6 +1827,8 @@ static int tdx_sept_free_private_spt(struct kvm *kvm, gfn_t gfn,
 				     enum pg_level level, void *private_spt)
 {
 	struct kvm_tdx *kvm_tdx = to_kvm_tdx(kvm);
+	struct page *page = virt_to_page(private_spt);
+	int ret;
 
 	/*
 	 * free_external_spt() is only called after hkid is freed when TD is
@@ -1843,7 +1845,12 @@ static int tdx_sept_free_private_spt(struct kvm *kvm, gfn_t gfn,
 	 * The HKID assigned to this TD was already freed and cache was
 	 * already flushed. We don't have to flush again.
 	 */
-	return tdx_reclaim_page(virt_to_page(private_spt));
+	ret = tdx_reclaim_page(page);
+	if (ret)
+		return ret;
+
+	tdx_pamt_put(page);
+	return 0;
 }
 
 static void tdx_sept_remove_private_spte(struct kvm *kvm, gfn_t gfn,
@@ -1895,6 +1902,7 @@ static void tdx_sept_remove_private_spte(struct kvm *kvm, gfn_t gfn,
 		return;
 
 	tdx_quirk_reset_page(page);
+	tdx_pamt_put(page);
 }
 
 void tdx_deliver_interrupt(struct kvm_lapic *apic, int delivery_mode,