[PATCH] Documentation: userspace-api: iommufd: Update HWPT_PAGING and HWPT_NESTED

From: Nicolin Chen
Date: Tue Sep 10 2024 - 16:42:00 EST


The previous IOMMUFD_OBJ_HW_PAGETABLE has been reworked to two separated
objects: IOMMUFD_OBJ_HWPT_PAGING and IOMMUFD_OBJ_HWPT_NESTED in order to
support a nested translation context.

Corresponding to the latest iommufd APIs and uAPIs, update the doc so as
to reflect the new design.

Signed-off-by: Nicolin Chen <nicolinc@xxxxxxxxxx>
---
Documentation/userspace-api/iommufd.rst | 131 ++++++++++++++++--------
1 file changed, 89 insertions(+), 42 deletions(-)

diff --git a/Documentation/userspace-api/iommufd.rst b/Documentation/userspace-api/iommufd.rst
index aa004faed5fd..3b0e46017dce 100644
--- a/Documentation/userspace-api/iommufd.rst
+++ b/Documentation/userspace-api/iommufd.rst
@@ -41,46 +41,65 @@ Following IOMMUFD objects are exposed to userspace:
- IOMMUFD_OBJ_DEVICE, representing a device that is bound to iommufd by an
external driver.

-- IOMMUFD_OBJ_HW_PAGETABLE, representing an actual hardware I/O page table
- (i.e. a single struct iommu_domain) managed by the iommu driver.
-
- The IOAS has a list of HW_PAGETABLES that share the same IOVA mapping and
- it will synchronize its mapping with each member HW_PAGETABLE.
+- IOMMUFD_OBJ_HWPT_PAGING, representing an actual hardware I/O page table
+ (i.e. a single struct iommu_domain) managed by the iommu driver. "PAGING"
+ primarly indicates this type of HWPT should be linked to an IOAS. It also
+ indicates that it is backed by an iommu_domain with __IOMMU_DOMAIN_PAGING
+ feature flag. This can be either an UNMANAGED stage-1 domain for a device
+ running in the user space, or a nesting parent stage-2 domain for mappings
+ from guest-level physical addresses to host-level physical addresses.
+
+ The IOAS has a list of HWPT_PAGINGs that share the same IOVA mapping and
+ it will synchronize its mapping with each member HWPT_PAGING.
+
+- IOMMUFD_OBJ_HWPT_NESTED, representing an actual hardware I/O page table
+ (i.e. a single struct iommu_domain) managed by user space (e.g. guest OS).
+ "NESTED" indicates that this type of HWPT can be linked to an HWPT_PAGING.
+ It also indicates that it is backed by an iommu_domain that has a type of
+ IOMMU_DOMAIN_NESTED. This must be a stage-1 domain for a device running in
+ the user space (e.g. in a guest VM enabling the IOMMU nested translation
+ feature.) So it must be created with a given nesting parent stage-2 domain
+ to associate to. This nested stage-1 page table managed by the user space
+ usually has mappings from guest-level I/O virtual addresses to guest-level
+ physical addresses.

All user-visible objects are destroyed via the IOMMU_DESTROY uAPI.

-The diagram below shows relationship between user-visible objects and kernel
+The diagrams below show relationships between user-visible objects and kernel
datastructures (external to iommufd), with numbers referred to operations
creating the objects and links::

- _________________________________________________________
- | iommufd |
- | [1] |
- | _________________ |
- | | | |
- | | | |
- | | | |
- | | | |
- | | | |
- | | | |
- | | | [3] [2] |
- | | | ____________ __________ |
- | | IOAS |<--| |<------| | |
- | | | |HW_PAGETABLE| | DEVICE | |
- | | | |____________| |__________| |
- | | | | | |
- | | | | | |
- | | | | | |
- | | | | | |
- | | | | | |
- | |_________________| | | |
- | | | | |
- |_________|___________________|___________________|_______|
- | | |
- | _____v______ _______v_____
- | PFN storage | | | |
- |------------>|iommu_domain| |struct device|
- |____________| |_____________|
+ _______________________________________________________________________
+ | iommufd (HWPT_PAGING only) |
+ | |
+ | [1] [3] [2] |
+ | ________________ _____________ ________ |
+ | | | | | | | |
+ | | IOAS |<---| HWPT_PAGING |<---------------------| DEVICE | |
+ | |________________| |_____________| |________| |
+ | | | | |
+ |_________|____________________|__________________________________|_____|
+ | | |
+ | ______v_____ ___v__
+ | PFN storage | (paging) | |struct|
+ |------------>|iommu_domain|<-----------------------|device|
+ |____________| |______|
+
+ _______________________________________________________________________
+ | iommufd (with HWPT_NESTED) |
+ | |
+ | [1] [3] [4] [2] |
+ | ________________ _____________ _____________ ________ |
+ | | | | | | | | | |
+ | | IOAS |<---| HWPT_PAGING |<---| HWPT_NESTED |<--| DEVICE | |
+ | |________________| |_____________| |_____________| |________| |
+ | | | | | |
+ |_________|____________________|__________________|_______________|_____|
+ | | | |
+ | ______v_____ ______v_____ ___v__
+ | PFN storage | (paging) | | (nested) | |struct|
+ |------------>|iommu_domain|<----|iommu_domain|<----|device|
+ |____________| |____________| |______|

1. IOMMUFD_OBJ_IOAS is created via the IOMMU_IOAS_ALLOC uAPI. An iommufd can
hold multiple IOAS objects. IOAS is the most generic object and does not
@@ -94,21 +113,48 @@ creating the objects and links::
device. The driver must also set the driver_managed_dma flag and must not
touch the device until this operation succeeds.

-3. IOMMUFD_OBJ_HW_PAGETABLE is created when an external driver calls the IOMMUFD
+3. IOMMUFD_OBJ_HWPT_PAGING can be created in two ways:
+
+ IOMMUFD_OBJ_HWPT_PAGING is created when an external driver calls the IOMMUFD
kAPI to attach a bound device to an IOAS. Similarly the external driver uAPI
allows userspace to initiate the attaching operation. If a compatible
pagetable already exists then it is reused for the attachment. Otherwise a
new pagetable object and iommu_domain is created. Successful completion of
this operation sets up the linkages among IOAS, device and iommu_domain. Once
- this completes the device could do DMA.
-
- Every iommu_domain inside the IOAS is also represented to userspace as a
- HW_PAGETABLE object.
+ this completes the device could do DMA. Note that every iommu_domain inside
+ the IOAS is also represented to userspace as an IOMMUFD_OBJ_HWPT_PAGING.
+
+ IOMMUFD_OBJ_HWPT_PAGING can be also manually created via the IOMMU_HWPT_ALLOC
+ uAPI, provided an ioas_id via @pt_id to associate the new HWPT_PAGING object
+ to the corresponding IOAS object. The benefit of this manual allocation is to
+ provide allocation flags (defined in enum iommufd_hwpt_alloc_flags), e.g. it
+ allocates a nesting parent HWPT_PAGING, if the IOMMU_HWPT_ALLOC_NEST_PARENT
+ flag is set.
+
+4. IOMMUFD_OBJ_HWPT_NESTED can be only manually created via the IOMMU_HWPT_ALLOC
+ uAPI, provided an hwpt_id via @pt_id to associate the new HWPT_NESTED object
+ to the corresponding HWPT_PAGING object. The associating HWPT_PAGING object
+ must be a nesting parent manually allocated via the same uAPI previously with
+ an IOMMU_HWPT_ALLOC_NEST_PARENT flag, otherwise the allocation will fail. The
+ allocation will be further validated by the IOMMU driver of an IOMMU hardware
+ that the given device (via @dev_id) is physically linked to, to ensure that
+ the nesting parent domain and a nested domain being allocated are compatible.
+ Successful completion of this operation sets up linkages among IOAS, device,
+ and iommu_domains. Once this completes the device could do DMA via a 2-stage
+ translation, a.k.a nested translation. Note that multiple HWPT_NESTED objects
+ can be allocated by (and then associated to) the same nesting parent.

.. note::

- Future IOMMUFD updates will provide an API to create and manipulate the
- HW_PAGETABLE directly.
+ Either a manual IOMMUFD_OBJ_HWPT_PAGING or an IOMMUFD_OBJ_HWPT_NESTED is
+ created via the same IOMMU_HWPT_ALLOC uAPI. The difference is at the type
+ of the object passed in via the @pt_id field of struct iommufd_hwpt_alloc:
+ When @pt_id carries an ioas_id to an IOAS object, the IOMMU_HWPT_ALLOC
+ call is instructed to allocate an HWPT_PAGING object only.
+ When @pt_id carries an hwpt_id to an HWPT_PAGING object, the uAPI call
+ is instructed to allocate an HWPT_NESTED object only.
+ If any other type of object is passed in via the @pt_id, the uAPI call
+ will fail.

A device can only bind to an iommufd due to DMA ownership claim and attach to at
most one IOAS object (no support of PASID yet).
@@ -120,7 +166,8 @@ User visible objects are backed by following datastructures:

- iommufd_ioas for IOMMUFD_OBJ_IOAS.
- iommufd_device for IOMMUFD_OBJ_DEVICE.
-- iommufd_hw_pagetable for IOMMUFD_OBJ_HW_PAGETABLE.
+- iommufd_hwpt_paging for IOMMUFD_OBJ_HWPT_PAGING.
+- iommufd_hwpt_nested for IOMMUFD_OBJ_HWPT_NESTED.

Several terminologies when looking at these datastructures:

--
2.34.1