Re: [PATCH v4 04/14] PCI/P2PDMA: Clear ACS P2P flags for all devices behind switches

From: Don Dutile
Date: Wed May 09 2018 - 11:52:10 EST

On 05/09/2018 10:44 AM, Alex Williamson wrote:
> On Wed, 9 May 2018 12:35:56 +0000
> "Stephen Bates" <sbates@xxxxxxxxxxxx> wrote:
>
>> Hi Alex and Don
>>
>>> Correct, the VM has no concept of the host's IOMMU groups, only
>>> the hypervisor knows about the groups,
>>
>> But as I understand it these groups are usually passed through to VMs
>> on a per-group basis by the hypervisor? So IOMMU group 1 might be
>> passed to VM A and IOMMU group 2 passed to VM B. So I agree the VM is
>> not aware of IOMMU groupings, but it is impacted by them in the sense
>> that if the groupings change, the PCI topology presented to the VM
>> needs to change too.

> Hypervisors don't currently expose any topology based on the grouping;
> the only case where such a concept even makes sense is when a vIOMMU is
> present, as devices within the same group cannot have separate address
> spaces. Our options for exposing such information are also limited; our
> only real option would seem to be placing devices within the same group
> together on a conventional PCI bus to denote the address space
> granularity. Currently we strongly recommend singleton groups for this
> case and leave any additional configuration constraints to the admin.

> The case you note of a group passed to VM A and another passed to VM B
> is exactly an example of why any sort of dynamic routing change needs to
> have the groups fully released, such as via hot-unplug. For instance,
> a routing change at a shared node above groups 1 & 2 could result in
> the merging of these groups, and there is absolutely no way to handle
> that with portions of the group being owned by two separate VMs after
> the merge. Thanks,
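As a host-side aside: the group granularity Alex describes is visible in sysfs under /sys/kernel/iommu_groups, so an admin can check which devices share a group before deciding what to assign. A minimal sketch (output depends entirely on the machine; on a host with no active IOMMU it prints nothing):

```shell
#!/bin/sh
# Enumerate the host's IOMMU groups and the PCI devices in each one.
# /sys/kernel/iommu_groups/ is the kernel's sysfs view of grouping;
# if no IOMMU is active, the glob matches nothing and nothing is printed.
for group in /sys/kernel/iommu_groups/*; do
    [ -d "$group" ] || continue            # glob did not match anything
    printf 'IOMMU group %s:\n' "${group##*/}"
    for dev in "$group"/devices/*; do
        [ -e "$dev" ] || continue
        printf '    %s\n' "${dev##*/}"     # device address, e.g. 0000:03:00.0
    done
done
```

Two devices listed under the same group number cannot be split between VMs, which is exactly the constraint discussed above.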


The above is why I stated that the host/HV has to do the p2p setup *before*
device assignment is done.
That setup could also be done at boot time (with a mod.conf-like config on the
host/HV, before VM startup).
Doing it dynamically, if such a feature is needed, requires a hot-unplug/plug
cycle, as Alex states.
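For illustration, the boot-time, mod.conf-like setup could look something like the fragment below: claim the devices for vfio-pci at module load, before any VM (or native driver) can grab them. The 10de:1db5 vendor:device ID and the nvme softdep are purely illustrative placeholders, not values from this thread:

```
# /etc/modprobe.d/vfio.conf -- illustrative sketch, adjust IDs per host
# Have vfio-pci claim the device at boot, before any VM starts.
options vfio-pci ids=10de:1db5
# Ensure vfio-pci loads before the native driver would bind the device.
softdep nvme pre: vfio-pci
```

With this in place, any p2p-related routing/ACS configuration can be settled on the host before the group is handed to a guest, matching the ordering argued for above.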