[PATCH v3 00/22] iommu/amd: Introduce AMD Hardware-accelerated Virtualized IOMMU (vIOMMU) Support

From: Suravee Suthikulpanit

Date: Mon Jun 29 2026 - 11:54:30 EST


AMD IOMMU introduces the vIOMMU feature, which provides partial hardware
acceleration for implementing guest IOMMUs. It accelerates the guest
Command Buffer, Event Log, and PPR Log, which eliminates the CPU overhead
of the supporting HV intercepts and reduces the latency of these
operations.

When a guest accesses guest IOMMU MMIO registers at offsets between 8KB
and 12KB (i.e. the 3rd 4K region), such as the Command Buffer, Event Log
and PPR Log head and tail pointer registers, the access is serviced
directly by the IOMMU. When the IOMMU accesses a Command Buffer, PPR Log
or a COMPLETION_WAIT store location in memory, it directly accesses guest
physical memory. The HV/VMM continues to trap and emulate the IOMMU
configuration MMIO registers between 0KB and 4KB (i.e. the 1st 4K
region), which are primarily used during initialization.
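
As a rough illustration of the split described above, the following
user-space sketch classifies a guest MMIO offset by its 4K region. All
names (viommu_classify_mmio(), the enum values) are hypothetical and do
not appear in the series. The 4KB-8KB region and offsets past 12KB are
not described in this cover letter, so the sketch only reports them as
unspecified.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names, for illustration only. */
#define VIOMMU_MMIO_REGION_SIZE 0x1000u	/* each region is 4K */

enum viommu_mmio_path {
	VIOMMU_MMIO_TRAP_EMULATE,	/* 0KB-4KB: config registers, HV/VMM traps */
	VIOMMU_MMIO_HW_SERVICED,	/* 8KB-12KB: head/tail pointers, IOMMU services */
	VIOMMU_MMIO_UNSPECIFIED,	/* not covered by the description above */
};

/* Map a guest IOMMU MMIO offset to the path that handles the access. */
static enum viommu_mmio_path viommu_classify_mmio(uint64_t offset)
{
	uint64_t region = offset / VIOMMU_MMIO_REGION_SIZE;

	switch (region) {
	case 0:
		return VIOMMU_MMIO_TRAP_EMULATE;
	case 2:
		return VIOMMU_MMIO_HW_SERVICED;
	default:
		return VIOMMU_MMIO_UNSPECIFIED;
	}
}

/* Usage: 0x2008 is an arbitrary offset inside the 3rd 4K region. */
static void viommu_mmio_example(void)
{
	assert(viommu_classify_mmio(0x0000) == VIOMMU_MMIO_TRAP_EMULATE);
	assert(viommu_classify_mmio(0x2008) == VIOMMU_MMIO_HW_SERVICED);
}
```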

Additionally, the HV must initialize the vIOMMU feature, map MMIO resources
between the VMs and the IOMMU, manage additional supporting data structures
in memory (e.g. GPA->SPA translation DTE, Device ID and Domain ID mapping
tables), and allocate/map vIOMMU Private Address region used as backing
storage memory for the IOMMU. Support for new IOMMU commands and events
specific to vIOMMU is also added.

Guest IOMMUs are IOMMUs exposed to VMs with additional support from the
VMM (QEMU) to generate the guest ACPI IVRS table and define the guest PCI
topology for the IOMMU and pass-through VFIO devices. The VMM changes are
not covered by this series.

For more detail, please see the vIOMMU section of the AMD IOMMU
Specification[1].

This is version 3 of the AMD HW-vIOMMU series. It is implemented on top of
the IOMMUFD vIOMMU, vDevice, and nested-domain framework in Linux
v7.1.0-rc4.

The series is organized into the following subsets:

Patch 1-3 : Preparatory patches
Patch 4-8 : Introduce IOMMUFD vIOMMU support and VF MMIO setup
Patch 9-14 : Introduce and map vIOMMU Private Address (IPA) region
Patch 15-18 : Introduce IOMMUFD vDevice support for AMD
Patch 19-22 : Translate-device-ID pool and per-vIOMMU translation DTE

Changes since v2:
(https://lore.kernel.org/linux-iommu/20260528051738.596013-1-suravee.suthikulpanit@xxxxxxx/)

Series scope:
* Reduced from 26 to 22 patches by deferring hw_queue, extended interrupt
remapping ioctl, and vIOMMU hardware-enable patches to a follow-on
series.

IOMMUFD vIOMMU support (patches 4-8):
* Patch 4: Update struct iommu_viommu_amd to include the
vfmmio_mmap_offset field.
* Patch 7: Fix casting when calling iommufd_viommu_alloc_mmap().

IPA / DTE infrastructure (patches 9-14):
* Merged domain creation and private-region alloc/map into one
patch (9). Added preparatory patches to pass iommu explicitly to
device_flush_dte(), export amd_iommu_alloc_dev_data(), and pass
iommu/devid to amd_iommu_make_clear_dte(). IOMMU IPA DTE assignment
(13) now allocates viommu_dev_data and uses the standard DTE update
path.
* Patch 9:
- In viommu_priv_alloc_map_flush(), clarify the need for
set_memory_uc() for vIOMMU backing storage memory, and remove the
unnecessary amd_iommu_flush_private_vm_region().
- In viommu_private_space_uninit(), add logic to properly flush after
unmap.
* Patch 14: Remove unnecessary struct iommu_domain_ops::iotlb_sync.

IOMMUFD vDevice support (patches 15-18):
* Patch 15: Call amd_viommu_uninit_one() before freeing the gid.

Translate device ID (TransDevID) - design change (patches 19-22):
* Driver allocates one trans_devid per IOMMUFD vIOMMU instance.
Each vIOMMU owns its translation DTE independently.
* Remove the kvmfd-based scheme in which multiple vIOMMU instances for
one VM shared one GPA->SPA translation DTE.
* struct iommu_viommu_amd no longer carries kvmfd; KVM FD handling
removed entirely from this series.
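
The per-instance ownership described above can be modelled in user space
roughly as follows. This is only a sketch under stated assumptions: the
pool size, the bitmap layout and every name (trans_devid_pool,
viommu_model, and so on) are hypothetical and are not taken from the
patches. It shows each modelled vIOMMU instance taking its own ID from a
pool and returning it on teardown.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical pool parameters, for illustration only. */
#define TRANS_DEVID_POOL_SIZE	64
#define TRANS_DEVID_INVALID	(-1)

struct trans_devid_pool {
	uint64_t used;		/* bit i set => ID i is allocated */
};

struct viommu_model {
	int trans_devid;	/* owned by exactly one instance */
};

/* Return the lowest free ID, or TRANS_DEVID_INVALID if the pool is full. */
static int trans_devid_alloc(struct trans_devid_pool *pool)
{
	for (int i = 0; i < TRANS_DEVID_POOL_SIZE; i++) {
		if (!(pool->used & (UINT64_C(1) << i))) {
			pool->used |= UINT64_C(1) << i;
			return i;
		}
	}
	return TRANS_DEVID_INVALID;
}

/* Freeing an out-of-range ID is a caller bug in this model. */
static void trans_devid_free(struct trans_devid_pool *pool, int id)
{
	assert(id >= 0 && id < TRANS_DEVID_POOL_SIZE);
	pool->used &= ~(UINT64_C(1) << id);
}

/* Each instance gets its own ID; returns 0 on success, -1 if exhausted. */
static int viommu_model_init(struct trans_devid_pool *pool,
			     struct viommu_model *v)
{
	v->trans_devid = trans_devid_alloc(pool);
	return v->trans_devid == TRANS_DEVID_INVALID ? -1 : 0;
}

static void viommu_model_destroy(struct trans_devid_pool *pool,
				 struct viommu_model *v)
{
	if (v->trans_devid != TRANS_DEVID_INVALID) {
		trans_devid_free(pool, v->trans_devid);
		v->trans_devid = TRANS_DEVID_INVALID;
	}
}
```

In this model, two instances created for the same VM never observe the
same ID, and an ID only becomes reusable after its owner is destroyed.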

Translation DTE helpers (patch 21):
* Allocates iommu_dev_data per trans_devid, programs through
amd_iommu_set_dte_v1() / amd_iommu_update_dte(), and frees on clear.
clone_aliases and PCI alias flush are skipped for trans_devid entries
without a struct device.

Testing done:
* Single/Multiple vIOMMU instances.
* Single/Multiple VFIO devices per vIOMMU instance.

[1] IOMMU Specification: https://docs.amd.com/v/u/en-US/48882_3.11_IOMMU_PUB
[2] Linux git tree: https://github.com/AMDESE/linux-iommu/tree/linux-7.1.0-rc4-amd-viommu_upstream_v3

Thank you,
Suravee

Suravee Suthikulpanit (22):
iommu/amd: Make amd_iommu_completion_wait() non-static
iommu/amd: Introduce vIOMMU-specific events and event
iommu/amd: Detect and initialize AMD vIOMMU feature
iommu/amd: Introduce IOMMUFD vIOMMU support for AMD
iommu/amd: Allocate Guest IDs for IOMMUFD vIOMMU instances
iommu/amd: Map vIOMMU VF and VF Control MMIO BARs
iommu/amd: Add support for AMD vIOMMU VF MMIO region
iommu/amd: Introduce Reset vMMIO Command
iommu/amd: Introduce and map vIOMMU private IPA region
iommu/amd: Pass iommu to device_flush_dte()
iommu/amd: Export amd_iommu_alloc_dev_data() helper
iommu/amd: Pass iommu and devid to amd_iommu_make_clear_dte()
iommu/amd: Assign IOMMU Private Address domain to IOMMU
iommu/amd: Add per-VM private IPA alloc/map helpers
iommu/amd: Add helper functions to manage DevID / DomID mapping tables
iommu/amd: Introduce IOMMUFD vDevice support for AMD
iommu/amd: Introduce helper function for updating domain ID mapping
table
iommu/amd: Introduce helper function for updating device ID mapping
table
iommu/amd: Add per-segment translate device ID pool
iommu/amd: Reserve translate-device-id for PCI requestor aliases
iommu/amd: Add translation DTE and VFctrl TransDevID helpers
iommu/amd: Assign per-vIOMMU translate device ID

drivers/iommu/amd/Makefile | 2 +-
drivers/iommu/amd/amd_iommu.h | 51 ++-
drivers/iommu/amd/amd_iommu_types.h | 100 +++++
drivers/iommu/amd/amd_viommu.h | 58 +++
drivers/iommu/amd/init.c | 23 +-
drivers/iommu/amd/iommu.c | 292 +++++++++++---
drivers/iommu/amd/iommufd.c | 120 ++++++
drivers/iommu/amd/nested.c | 19 +-
drivers/iommu/amd/trans_devid.c | 186 +++++++++
drivers/iommu/amd/viommu.c | 566 ++++++++++++++++++++++++++++
include/uapi/linux/iommufd.h | 10 +
11 files changed, 1374 insertions(+), 53 deletions(-)
create mode 100644 drivers/iommu/amd/amd_viommu.h
create mode 100644 drivers/iommu/amd/trans_devid.c
create mode 100644 drivers/iommu/amd/viommu.c

--
2.34.1