Re: [PATCH v2 05/31] x86/virt/tdx: Extend tdx_page_array to support IOMMU_MT
From: Xu Yilun
Date: Tue Mar 31 2026 - 10:41:37 EST
> > +static int tdx_alloc_pages_iommu_mt(unsigned int nr_pages, struct page **pages,
> > + void *data)
> > +{
> > + unsigned int iq_order = (unsigned int)(long)data;
> > + struct folio *t_iq, *t_ctxiq;
> > + int ret;
> > +
> > + /* TODO: folio_alloc_node() is preferred, but need numa info */
> > + t_iq = folio_alloc(GFP_KERNEL | __GFP_ZERO, iq_order);
> > + if (!t_iq)
> > + return -ENOMEM;
> > +
> > + t_ctxiq = folio_alloc(GFP_KERNEL | __GFP_ZERO, iq_order);
> > + if (!t_ctxiq) {
> > + ret = -ENOMEM;
> > + goto out_t_iq;
> > + }
> > +
> > + ret = tdx_alloc_pages_bulk(nr_pages - 2, pages + 2, NULL);
> > + if (ret)
> > + goto out_t_ctxiq;
> > +
> > + pages[0] = folio_page(t_iq, 0);
> > + pages[1] = folio_page(t_ctxiq, 0);
>
> To me it seems like this can't really be called a page array any more. The first
> two u64's are too special. Instead it's a special one-off ABI format passed via
> a page.
>
> BTW, I can't find TDH.IOMMU.SETUP in the docs. Any pointers?
https://cdrdv2.intel.com/v1/dl/getContent/858625
>
> > +
> > + return 0;
> > +
> > +out_t_ctxiq:
> > + folio_put(t_ctxiq);
> > +out_t_iq:
> > + folio_put(t_iq);
> > +
> > + return ret;
> > +}
> > +
> > +/**
> > + * tdx_page_array_create_iommu_mt() - Create a page array for IOMMU Memory Tables
> > + * @iq_order: The allocation order for the IOMMU Invalidation Queue.
> > + * @nr_mt_pages: Number of additional order-0 pages for the MT.
> > + *
> > + * Allocate and populate a specialized tdx_page_array for IOMMU_MT structures.
> > + * The resulting array consists of two multi-order folios (at index 0 and 1)
> > + * followed by the requested number of order-0 pages.
> > + *
> > + * Return: Fully populated tdx_page_array or NULL on failure
> > + */
> > +struct tdx_page_array *
> > +tdx_page_array_create_iommu_mt(unsigned int iq_order, unsigned int nr_mt_pages)
> > +{
> > + unsigned int nr_pages = nr_mt_pages + 2;
>
> Consider the amount of tricks that are needed to coax the tdx_page_array to
> populate the handoff page as needed. It adds 2 pages here, then subtracts them
> later in the callback. Then tweaks the pa in tdx_page_array_populate() to add
> the length...
Hmm. The tricky part is the specific memory requirement/allocation; the
common part is the PA list contained in a root page. Maybe we model only
the latter and let each specific user do its own memory allocation. Is that
closer to your "break concepts apart" idea?
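
To make the split concrete, here is a rough userspace sketch, not a proposed
patch. Every sketch_* name is hypothetical, the 4K root page with 512 u64
slots is an assumption, and it does not model whatever length/order encoding
TDH.IOMMU.SETUP expects in the first two entries. The common helper only
copies a caller-supplied PA list into the root page; the IOMMU_MT user owns
the special layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_PAGE_SIZE   4096u
#define SKETCH_ROOT_NENTS  (SKETCH_PAGE_SIZE / sizeof(uint64_t))

/* Hypothetical common part: one root page holding a list of PAs. */
struct sketch_pa_root {
	uint64_t pa[SKETCH_ROOT_NENTS];
};

/*
 * Copy nr caller-supplied PAs into the root page. No allocation happens
 * here. Returns the number of entries written, or -1 if the list does not
 * fit or an entry is not page aligned (root is left untouched on error).
 */
static int sketch_pa_root_fill(struct sketch_pa_root *root,
			       const uint64_t *pas, unsigned int nr)
{
	unsigned int i;

	if (nr > SKETCH_ROOT_NENTS)
		return -1;

	for (i = 0; i < nr; i++)
		if (pas[i] & (SKETCH_PAGE_SIZE - 1))
			return -1;

	memset(root, 0, sizeof(*root));
	memcpy(root->pa, pas, nr * sizeof(pas[0]));
	return (int)nr;
}

/*
 * Hypothetical IOMMU_MT-specific part: the user has already allocated the
 * two queues and the MT pages, and decides the ordering itself.
 * out must have room for nr_mt + 2 entries. Returns the entry count.
 */
static unsigned int sketch_iommu_mt_pas(uint64_t iq_pa, uint64_t ctxiq_pa,
					const uint64_t *mt_pas,
					unsigned int nr_mt, uint64_t *out)
{
	assert(out != NULL);

	out[0] = iq_pa;		/* invalidation queue first */
	out[1] = ctxiq_pa;	/* then the second queue */
	if (nr_mt)
		memcpy(out + 2, mt_pas, nr_mt * sizeof(mt_pas[0]));

	return nr_mt + 2;
}
```

With this shape the "+ 2" / "- 2" bookkeeping stays entirely inside the
IOMMU_MT user, and the common code never needs an allocation callback.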
>
> > + struct tdx_page_array *array;
> > + int populated;
> > +
> > + if (nr_pages > TDX_PAGE_ARRAY_MAX_NENTS)
> > + return NULL;
> > +
> > + array = tdx_page_array_alloc(nr_pages, tdx_alloc_pages_iommu_mt,
> > + (void *)(long)iq_order);
> > + if (!array)
> > + return NULL;
> > +
> > + populated = tdx_page_array_populate(array, 0);
> > + if (populated != nr_pages)
> > + goto out_free;
> > +
> > + return array;
> > +
> > +out_free:
> > + tdx_page_array_free(array);
> > + return NULL;
> > +}
> > +EXPORT_SYMBOL_GPL(tdx_page_array_create_iommu_mt);