Re: [PATCH v2 03/31] x86/virt/tdx: Add tdx_page_array helpers for new TDX Module objects

From: Xu Yilun

Date: Mon Mar 30 2026 - 12:25:09 EST

> > Typical usages of the tdx_page_array:
> >
> > 1. Add control pages:
> > - struct tdx_page_array *array = tdx_page_array_create(nr_pages);
> > - seamcall(TDH_XXX_CREATE, array, ...);
> >
> > 2. Release control pages:
> > - seamcall(TDX_XXX_DELETE, array, &nr_released, &released_hpa);
> > - tdx_page_array_ctrl_release(array, nr_released, released_hpa);
>
> So release is mostly needed because of the need to do the wbvind seamcall? And
> unlike tdx_page_array_free() it returns an error in case that fails. Or other
> sanity checking. But all the callers do the same thing on error, call
> tdx_page_array_ctrl_leak().
>
> Just wondering if we could simplify it somehow. There are two helpers and the
> caller has to know which one to call based on SEAMCALL specifics. What if the
> seamcall wrapper set a bit in the page array while passing it out. The bit would
> specify to the helper if it needs to do wbinvd or not. Then the wrappers could
> encapsulate the type of free needed and not rely on the caller to know. And we
> only need to have one function for it instead of two.

I like the idea, I can have a try.

But we may need more than a bit to finish the release. On release some
SEAMCALLs return the released HPA list and host checks they are sane,
otherwise leak. We need to also record these info in the struct.

>
>
> BTW, do we expect errors from the tdh_phymem_page_wbinvd_hkid() calls here? How

No, it can't happen in normal runtime.

> could the BUSY happen? If we don't think it can happen in normal runtime, we
> could just warn and skip the special leak logic. In KVM side there is a place
> where we can't really handle it for the wbinvd calls. And one where we can. If

But we do leak control pages when wbinvd fails, or even when
tdh_phymem_page_reclaim() fails. So anyway we need leak logic, is it?
Is it a little insane if we failed to reclaim and return the private
page to kernel?

> we need a ton of code to handle a bug somewhere (on kernel side or TDX module),
> it seems too defensive to me. At least it's not in sync with the rest of TDX.
>
> Especially the quite large tdx_page_array_validate_release() logic should need a
> justification that there is something very tricky that needs all this checking.
>
> But maybe you can explain what the special risk is.

I don't see the special risk, actually I don't even see the releasing
failed once.

But we do check the return value of tdh_phymem_page_reclaim() for a
single ctrl page releasing. It is just we also check a list of ctrl
pages releasing. The check becomes naturally complex, e.g., if the
released number matches, if every HPA in the list matches ...

...

> > +struct tdx_page_array {
> > + /* public: */
> > + unsigned int nr_pages;
> > + struct page **pages;
> > +
> > + /* private: */
> > + unsigned int offset;
> > + unsigned int nents;
> > + u64 *root;
>
> pages is going to be an array of struct pointers, and root is a single page of
> PA's that gets re-used to copy and pass the PA's to the TDX module. Why do we
> need both? Like just keep an array of PA's that would be the same size as the
> struct page array. And not need the populate loop?

We need Linux language, struct page *, for alloc and free. Also need
TDX Module language - PA list - for SEAMCALLs. So IIUC, the page to PA
populating won't disappear on allocation, the PA to page populating
would appear on free.

Besides, host may need to vmap and access the (shared) pages.

Thanks,
Yilun