Re: [PATCH v3 01/24] x86/tdx: Enhance tdh_mem_page_aug() to support huge pages

From: Dave Hansen

Date: Tue Jan 06 2026 - 16:08:01 EST


On 1/6/26 02:18, Yan Zhao wrote:
> Enhance the SEAMCALL wrapper tdh_mem_page_aug() to support huge pages.
>
> The SEAMCALL TDH_MEM_PAGE_AUG currently supports adding physical memory to
> the S-EPT up to 2MB in size.
>
> While keeping the "level" parameter in the tdh_mem_page_aug() wrapper to
> allow callers to specify the physical memory size, introduce the parameters
> "folio" and "start_idx" to specify the physical memory starting from the
> page at "start_idx" within the "folio". The specified physical memory must
> be fully contained within a single folio.
>
> Invoke tdx_clflush_page() for each 4KB segment of the physical memory being
> added. tdx_clflush_page() performs CLFLUSH operations conservatively to
> prevent dirty cache lines from writing back later and corrupting TD memory.

This changelog is heavy on the "what" and weak on the "why". It's not
telling me what I need to know.

...
> + struct folio *folio = page_folio(page);
> gpa_t gpa = gfn_to_gpa(gfn);
> u64 entry, level_state;
> u64 err;
>
> - err = tdh_mem_page_aug(&kvm_tdx->td, gpa, tdx_level, page, &entry, &level_state);
> -
> + err = tdh_mem_page_aug(&kvm_tdx->td, gpa, tdx_level, folio,
> + folio_page_idx(folio, page), &entry, &level_state);
> if (unlikely(IS_TDX_OPERAND_BUSY(err)))
> return -EBUSY;

For example, 'folio' is able to be trivially derived from page. Yet,
this removes the 'page' argument and replaces it with 'folio' _and_
another value which can be derived from 'page'.

This looks superficially like an illogical change. *Why* was this done?

> diff --git a/arch/x86/virt/vmx/tdx/tdx.c b/arch/x86/virt/vmx/tdx/tdx.c
> index b0b33f606c11..41ce18619ffc 100644
> --- a/arch/x86/virt/vmx/tdx/tdx.c
> +++ b/arch/x86/virt/vmx/tdx/tdx.c
> @@ -1743,16 +1743,23 @@ u64 tdh_vp_addcx(struct tdx_vp *vp, struct page *tdcx_page)
> }
> EXPORT_SYMBOL_GPL(tdh_vp_addcx);
>
> -u64 tdh_mem_page_aug(struct tdx_td *td, u64 gpa, int level, struct page *page, u64 *ext_err1, u64 *ext_err2)
> +u64 tdh_mem_page_aug(struct tdx_td *td, u64 gpa, int level, struct folio *folio,
> + unsigned long start_idx, u64 *ext_err1, u64 *ext_err2)
> {
> struct tdx_module_args args = {
> .rcx = gpa | level,
> .rdx = tdx_tdr_pa(td),
> - .r8 = page_to_phys(page),
> + .r8 = page_to_phys(folio_page(folio, start_idx)),
> };
> + unsigned long npages = 1 << (level * PTE_SHIFT);
> u64 ret;

This 'npages' calculation is not obviously correct. It's not clear what
"level" is or what values it should have.

This is precisely the kind of place to deploy a helper that explains
what is going on.

> - tdx_clflush_page(page);
> + if (start_idx + npages > folio_nr_pages(folio))
> + return TDX_OPERAND_INVALID;

Why is this necessary? Would it be a bug if this happens?

> + for (int i = 0; i < npages; i++)
> + tdx_clflush_page(folio_page(folio, start_idx + i));

All of the page<->folio conversions are kinda hurting my brain. I think
we need to decide what the canonical type for these things is in TDX, do
the conversion once, and stick with it.