Re: [PATCH v2 04/19] iommufd: Allow pt_id to carry viommu_id for IOMMU_HWPT_ALLOC
From: Nicolin Chen
Date: Fri Sep 27 2024 - 02:03:05 EST
On Fri, Sep 27, 2024 at 01:38:08PM +0800, Yi Liu wrote:
> > > Does it mean each vIOMMU of VM can only have
> > > one s2 HWPT?
> >
> > Giving some examples here:
> > - If a VM has 1 vIOMMU, there will be 1 vIOMMU object in the
> > kernel holding one S2 HWPT.
> > - If a VM has 2 vIOMMUs, there will be 2 vIOMMU objects in the
> > kernel that can hold two different S2 HWPTs, or share one S2
> > HWPT (saving memory).
>
> So if you have two devices assigned to a VM, then you may have two
> vIOMMUs or one vIOMMU exposed to guest. This depends on whether the two
> devices are behind the same physical IOMMU. If it's two vIOMMUs, the two
> can share the s2 hwpt if their physical IOMMUs are compatible. Is that right?
Yes.
> To achieve the above, you need to know the physical IOMMUs of the
> assigned devices, and hence be able to tell if the physical IOMMUs are
> the same and if they are compatible. How would userspace know such info?
My draft implementation with QEMU does something like this:
- List all viommu-matched iommu nodes under /sys/class/iommu: LINKs
- Get PCI device's /sys/bus/pci/devices/0000:00:00.0/iommu: LINK0
- Compare the LINK0 against the LINKs
So far we don't have an ID for a physical IOMMU instance; if we had
one, returning it via the hw_info call could be an alternative.
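
The sysfs comparison described above could be sketched roughly as
follows. This is only an illustration, not the QEMU draft code: the
function names and the sysfs_root parameter are made up here, and it
lists every node under /sys/class/iommu without the "viommu-matched"
filtering, which would depend on the vIOMMU type.

```python
import os


def list_iommu_nodes(sysfs_root="/sys"):
    """Return {resolved_path: name} for each entry in class/iommu (the LINKs)."""
    class_dir = os.path.join(sysfs_root, "class", "iommu")
    nodes = {}
    if not os.path.isdir(class_dir):
        return nodes
    for name in sorted(os.listdir(class_dir)):
        # Each entry is expected to be a symlink to the IOMMU device node.
        nodes[os.path.realpath(os.path.join(class_dir, name))] = name
    return nodes


def iommu_of_pci_device(bdf, sysfs_root="/sys"):
    """Return the class/iommu name backing a PCI device, or None if unknown."""
    # LINK0: the per-device "iommu" symlink.
    link = os.path.join(sysfs_root, "bus", "pci", "devices", bdf, "iommu")
    if not os.path.exists(link):
        return None
    # Compare LINK0 against the LINKs by their resolved targets.
    return list_iommu_nodes(sysfs_root).get(os.path.realpath(link))
```

Two devices for which iommu_of_pci_device() returns the same name
would then be placed on buses behind the same vIOMMU.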
QEMU then does the routing to assign PCI buses and IORT (or DT).
It has been suggested to move this part to libvirt, though. So I
think that, at the end of the day, libvirt would run the sysfs check
and assign a device to the corresponding PCI bus backed by the
correct IOMMU.
Here is an example where two devices behind iommu0 and a third
device behind iommu1 are assigned to a VM:
-device pxb-pcie,id=pcie.viommu0,bus=pcie.0.... \ # bus for viommu0
-device pxb-pcie,id=pcie.viommu1,bus=pcie.0.... \ # bus for viommu1
-device pcie-root-port,id=pcie.viommu0p0,bus=pcie.viommu0... \
-device pcie-root-port,id=pcie.viommu0p1,bus=pcie.viommu0... \
-device pcie-root-port,id=pcie.viommu1p0,bus=pcie.viommu1... \
-device vfio-pci,bus=pcie.viommu0p0... \ # connect to bus for viommu0
-device vfio-pci,bus=pcie.viommu0p1... \ # connect to bus for viommu0
-device vfio-pci,bus=pcie.viommu1p0... # connect to bus for viommu1
To check whether a stage-2 HWPT can be shared, we would basically
attach the device to one of the stage-2 HWPTs from the list that the
VMM should keep. This attach runs all the compatibility tests, down
to the IOMMU driver. If it fails, just allocate a new stage-2 HWPT.
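
As a rough sketch of that attach-or-allocate flow on the VMM side:
try_attach and alloc_s2_hwpt below are hypothetical callbacks standing
in for the real iommufd attach and IOMMU_HWPT_ALLOC paths, and trying
every kept HWPT in order (then attaching to the newly allocated one)
is my assumption about how a VMM might walk its list.

```python
from typing import Any, Callable, List


def attach_or_allocate(
    device: Any,
    s2_hwpts: List[Any],
    try_attach: Callable[[Any, Any], bool],
    alloc_s2_hwpt: Callable[[Any], Any],
) -> Any:
    """Attach device to a compatible stage-2 HWPT, allocating one if needed."""
    # The attach itself is the compatibility test; a failure simply means
    # this HWPT cannot be shared with the device.
    for hwpt in s2_hwpts:
        if try_attach(device, hwpt):
            return hwpt

    # No existing stage-2 HWPT accepted the device: allocate a new one.
    hwpt = alloc_s2_hwpt(device)
    if not try_attach(device, hwpt):
        raise RuntimeError("attach to a freshly allocated S2 HWPT failed")
    s2_hwpts.append(hwpt)
    return hwpt
```

With this, devices behind compatible physical IOMMUs end up sharing
one entry in s2_hwpts, and the list only grows when an attach fails.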
Thanks
Nic