Re: [PATCH v4 13/16] iommu/amd: Track host Domain ID mapping for each guest Domain ID

From: Suthikulpanit, Suravee

Date: Wed Nov 05 2025 - 05:50:56 EST


Jason

On 10/23/2025 3:08 AM, Jason Gunthorpe wrote:
On Tue, Oct 21, 2025 at 01:43:21AM +0000, Suravee Suthikulpanit wrote:
Each nested domain is assigned a guest domain ID (gDomID), which the guest
OS programs into the guest Device Table Entry (gDTE). For each gDomID, the
driver assigns a corresponding host domain ID (hDomID), which will be
programmed into the host Device Table Entry (hDTE).

The gDTE to hDTE 1:1 mapping is stored in the nest parent domain using
an xarray (struct protection_domain.gdomid_array). When invalidating the
nest parent domain, INVALIDATE_IOMMU_PAGES must be issued for each
hDomID in the gdomid_array.

I think this should be stored in the viommu..

It is a small unrealistic detail but very pedantically the API allows
creating two VIOMMUs from the same NEST PARENT domain and if someone
did this then each VIOMMU should have its own private gDomID
number space and its own separate xarray.

Actually, supporting nested translation with HW-based vIOMMU in a guest that has two VFIO devices behind two different physical IOMMUs requires setting up two iommufd_viommu structures (one per IOMMU) that share the same parent domain (a single GPA-to-SPA mapping). Also, the AMD HW-vIOMMU uses a single domain ID mapping table (gDomID-to-hDomID) per guest ID. Since the table is indexed by gDomID, it requires a single gDomID space per guest.

In this case, it makes more sense to store the gDomID-to-hDomID mapping in the parent domain since:

1. There is one gDomID space per guest and one parent domain per guest.

2. When the host issues an invalidation for a parent domain, the IOMMU driver needs to issue an invalidate command for each hDomID used with that parent domain (on each IOMMU). We can't do this if we store the xarray in the viommu, unless we also keep a list of vIOMMUs per parent domain.
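
To illustrate point 2, here is a rough userspace sketch (not the driver code) of the fan-out: one invalidate command per in-use hDomID, repeated for each IOMMU that shares the parent domain. Everything here is an assumption made for illustration: struct parent_domain, struct inv_cmd, build_hdom_flush(), MAX_GDOMID and NO_HDOMID are hypothetical names, a fixed array stands in for the xarray, and commands are collected into a buffer instead of being queued to hardware.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_GDOMID 8	/* hypothetical table size for this sketch */
#define NO_HDOMID  0	/* hypothetical marker for an unused slot */

/* Stand-in for the parent domain: hDomID table indexed by gDomID. */
struct parent_domain {
	uint32_t hdomid[MAX_GDOMID];
};

/* Stand-in for one invalidate command aimed at one IOMMU. */
struct inv_cmd {
	int iommu_id;
	uint32_t hdomid;
	uint64_t address;
	size_t size;
};

/*
 * Build one command per (IOMMU, in-use hDomID) pair.
 * Writes at most out_len commands and returns how many are needed,
 * so a caller can detect a buffer that was too small.
 */
size_t build_hdom_flush(const struct parent_domain *parent,
			const int *iommu_ids, size_t nr_iommus,
			uint64_t address, size_t size,
			struct inv_cmd *out, size_t out_len)
{
	size_t needed = 0;

	assert(parent != NULL);

	for (size_t n = 0; n < nr_iommus; n++) {
		for (size_t g = 0; g < MAX_GDOMID; g++) {
			uint32_t hdomid = parent->hdomid[g];

			if (hdomid == NO_HDOMID)
				continue;
			if (needed < out_len)
				out[needed] = (struct inv_cmd){
					iommu_ids[n], hdomid, address, size
				};
			needed++;
		}
	}
	return needed;
}
```

With two in-use hDomIDs and two IOMMUs, the sketch produces four commands. Because the table hangs off the parent domain, the loop needs only the parent and the set of IOMMUs.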

Allowing two VIOMMUs to share the same hDomID could be problematic
because we don't know whether the PASID layout is consistent.

Not sure why PASID layout matters here?

+static int iommu_flush_hdom_ids(struct amd_iommu *iommu,
+				u64 address, size_t size,
+				struct protection_domain *parent)
+{
+	int ret = 0;
+	unsigned long i;
+	struct iommu_cmd cmd;
+	struct nested_domain *ndom;
+
+	xa_for_each(&parent->gdomid_array, i, ndom) {

This doesn't seem right.. There could be many nested_domains sharing
the same gDomID..

I expect this xarray to have a struct like

struct gdomid {
	refcount_t users;
	u32 hdomid;
};

And each nested_domain will go into the viommu and either allocate a
new gdomid or ++users for the existing one. Inverse when destroying a
nested_domain.
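
A minimal userspace sketch of the refcounting scheme described above, assuming a fixed-size array in place of the xarray and a plain counter in place of refcount_t. The names gdomid_table, gdomid_get(), gdomid_put(), MAX_GDOMID and next_hdomid are hypothetical and only for illustration; this is not the driver code and not necessarily what v5 will look like.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_GDOMID 16	/* hypothetical table size for this sketch */

/* Same shape as the struct suggested in the review. */
struct gdomid {
	unsigned int users;	/* plain counter standing in for refcount_t */
	uint32_t hdomid;
};

/* Stand-in for the xarray: one slot per gDomID. */
struct gdomid_table {
	struct gdomid *slot[MAX_GDOMID];
	uint32_t next_hdomid;	/* hypothetical hDomID allocator */
};

/*
 * Called when a nested domain is created: allocate a new entry on
 * first use of this gDomID, otherwise take another reference.
 */
struct gdomid *gdomid_get(struct gdomid_table *t, uint32_t gdomid)
{
	struct gdomid *e;

	if (gdomid >= MAX_GDOMID)
		return NULL;

	e = t->slot[gdomid];
	if (e) {
		e->users++;
		return e;
	}

	e = malloc(sizeof(*e));
	if (!e)
		return NULL;
	e->users = 1;
	e->hdomid = t->next_hdomid++;
	t->slot[gdomid] = e;
	return e;
}

/*
 * Called when a nested domain is destroyed: drop a reference and
 * free the entry once the last user is gone.
 */
void gdomid_put(struct gdomid_table *t, uint32_t gdomid)
{
	struct gdomid *e;

	assert(gdomid < MAX_GDOMID);
	e = t->slot[gdomid];
	assert(e && e->users > 0);

	if (--e->users == 0) {
		t->slot[gdomid] = NULL;
		free(e);
	}
}
```

Two nested domains that use the same gDomID end up sharing one entry and one hDomID, so an invalidation walk over the table visits each hDomID once regardless of how many nested domains reference it.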

Got it. I have new code for this and will send out in v5 soon.

Thanks,
Suravee