Re: [PATCH v2 04/10] PCI/P2PDMA: Clear ACS P2P flags for all devices behind switches
From: Alex Williamson
Date: Thu Mar 01 2018 - 16:22:08 EST
On Thu, 1 Mar 2018 18:54:01 +0000
"Stephen Bates" <sbates@xxxxxxxxxxxx> wrote:
> Thanks for the detailed review Bjorn!
>
> >>
> >> + Enabling this option will also disable ACS on all ports behind
> >> any PCIe switch. This effectively puts all devices behind any
> >> + switch into the same IOMMU group.
>
> >
> > Does this really mean "all devices behind the same Root Port"?
>
> Not necessarily. You might have a cascade of switches (i.e., switches below a switch) to achieve a very large fan-out (in an NVMe SSD array, for example), and we will only disable ACS on the ports below the relevant switch.
>
> > What does this mean in terms of device security? I assume it means,
> > at least, that individual devices can't be assigned to separate VMs.
>
> This was discussed during v1 [1]. Disabling ACS on all downstream ports of the switch means that all the EPs below it have to be part of the same IOMMU group. However, it was also agreed that as long as the ACS disable occurs at boot time (which it does in v2), the virtualization layer will be aware of it and will perform the IOMMU group formation correctly.
This is still a pretty terrible solution, though. Your kernel provider
needs to decide whether they favor device assignment or p2p, because we
can't do both, unless there's a patch I haven't seen yet that allows
boot time rather than compile time configuration. There are absolutely
supported device assignment cases of switches providing isolation between
devices allowing the downstream EPs to be used independently. I think
this is a non-starter for distribution support without boot time or
dynamic configuration. I could imagine dynamic configuration through
sysfs that might trigger a soft remove and rescan of the affected
devices in order to rebuild the IOMMU group. The hard part might be
determining which points to allow that to guarantee correctness. For
instance, upstream switch ports don't actually support ACS, but they'd
otherwise be an obvious concentration point to trigger a
reconfiguration. Thanks,
Alex
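
For anyone who wants to see how the grouping looks from userspace, here
is a minimal sketch. It assumes the usual sysfs layout, where each group
appears as /sys/kernel/iommu_groups/<N>/devices/ with one entry per
device. The function names are hypothetical and not part of the patch
set, and it only reads the current grouping; it does not change ACS or
trigger a rescan.

```python
import os


def iommu_groups(root="/sys/kernel/iommu_groups"):
    """Map each IOMMU group number to the sorted device addresses in it.

    Assumes the layout <root>/<N>/devices/<address>. Returns an empty
    dict if <root> does not exist (e.g. no IOMMU enabled).
    """
    groups = {}
    if not os.path.isdir(root):
        return groups
    for name in os.listdir(root):
        devdir = os.path.join(root, name, "devices")
        # Group directories are named by their numeric ID.
        if name.isdigit() and os.path.isdir(devdir):
            groups[int(name)] = sorted(os.listdir(devdir))
    return groups


def shared_groups(groups):
    """Return only the groups that hold more than one device.

    Devices in such a group cannot be assigned independently.
    """
    return {g: devs for g, devs in groups.items() if len(devs) > 1}


if __name__ == "__main__":
    for group, devs in sorted(iommu_groups().items()):
        print(group, " ".join(devs))
```

With ACS disabled on a switch's downstream ports, I would expect the EPs
below that switch to show up together in one entry of shared_groups().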