Re: [PATCH v2 05/31] x86/virt/tdx: Extend tdx_page_array to support IOMMU_MT
From: Dan Williams
Date: Tue Apr 21 2026 - 17:53:53 EST
Xu Yilun wrote:
> On Fri, Apr 17, 2026 at 04:58:43PM -0700, Dan Williams wrote:
> > Xu Yilun wrote:
> > [..]
> > > >
> > > > I'm drafting some changes and make the tdx_page_array look like:
> > > >
> > > > struct tdx_page_array {
> > > > /* public: */
> > > > unsigned int nr_pages;
> > > > struct page **pages;
> > > >
> > > > /* private: */
> > > > u64 *root;
> > > > bool flush_on_free;
> >
> > How about "need_phymem_page_wbinvd"?
>
> Yes.
>
> >
> > That makes it a bit more greppable and not to be confused with other
> > flushing.
> >
> > [..]
> > > Hi, I end up made the following changes on top of this series:
> > >
> > > -------8<--------
> > >
> > > arch/x86/include/asm/tdx.h | 32 +-
> > > arch/x86/virt/vmx/tdx/tdx.c | 561 ++++++++------------------
> > > drivers/virt/coco/tdx-host/tdx-host.c | 179 ++++++--
> > > 3 files changed, 316 insertions(+), 456 deletions(-)
> > >
> > > + ret = tdx_ext_mem_setup(nr_pages, &ext_mem);
> > > if (ret)
> > > + return ret;
> > > }
> > >
> > > + ret = tdx_ext_init();
> > > + if (ret)
> > > + goto out_remove_ext_mem;
> > > +
> > > /*
> > > + * Extensions memory is never reclaimed once assigned, stop tracking it
> > > + * and free the tracking structures.
> > > */
> > > + tdx_page_array_free(ext_mem.chunk);
> >
> > Wait, these pages belong to the module now, they can't be freed, or I am
> > missing something?
>
> With this new solution, tdx_page_array is downgraded to a descriptor,
> doesn't manage the actual data pages/memory any more. So
> tdx_page_array_free() will not free data pages, only frees the
> tdx_page_array descriptor.
Oh, I was confused by the fact that tdx_page_array_free() still loops
through array->pages in the need_wbinvd case. In the case of "never
reclaim" it will also "never wbinvd". ...and this why populate has that
"WARN_ON_ONCE(array->pages && array->flush_on_free);".
A couple recommendations come to mind:
* s/tdx_page_array_free/tdx_page_array_destroy/
...since "destroy" mirrors create and matches other cases where only
metadata is managed.
* Create a new tdx_page_array_repopulate() helper to make it clear which
paths depend on being able to repopulate and move the WARN_ON_ONCE() out of
the common path that does not repopulate. "repopulate" can have
"realloc" semantics where it allocates on first use, but otherwise
"populate" gets to not care about the corner cases. Make the WARN case
fail repopulate.
> > > pr_info("%lu KB allocated for TDX Module Extensions\n",
> > > nr_pages * PAGE_SIZE / 1024);
> > >
> > > return 0;
> > >
> > > -out_flush:
> > > - if (ext_mem)
> > > +out_remove_ext_mem:
> > > + if (nr_pages) {
> > > + /*
> > > + * TDH.EXT.MEM.ADD only collects required memory. TDX.EXT.INIT
> > > + * does the actual initialization so if it fails some pages may
> > > + * have been touched by the TDX module, flush cache before
> > > + * returning these pages to kernel.
> > > + */
> > > wbinvd_on_all_cpus();
> > > + tdx_ext_mem_remove(&ext_mem);
> >
> > This only releases the last populated chunk, not all previous chunks,
> > right?
>
> Not true. ext_mem stores all the data pages and the reusable descriptor
> 'chunk' for SEAMCALL. tdx_ext_mem_remove() removes all the data pages
> and the 'chunk'.
Yes, see that now.