Re: [PATCH v4 04/14] PCI/P2PDMA: Clear ACS P2P flags for all devices behind switches

From: Alex Williamson
Date: Tue May 08 2018 - 18:32:13 EST

On Tue, 8 May 2018 16:10:19 -0600
Logan Gunthorpe <logang@xxxxxxxxxxxx> wrote:

> On 08/05/18 04:03 PM, Alex Williamson wrote:
> > If IOMMU grouping implies device assignment (because nobody else uses
> > it to the same extent as device assignment) then the build-time option
> > falls to pieces, we need a single kernel that can do both. I think we
> > need to get more clever about allowing the user to specify exactly at
> > which points in the topology they want to disable isolation. Thanks,
> Yeah, so based on the discussion I'm leaning toward just having a
> command line option that takes a list of BDFs and disables ACS for them.
> (Essentially as Dan has suggested.) This avoids the shotgun.
> Then, the pci_p2pdma_distance command needs to check that ACS is
> disabled for all bridges between the two devices. If this is not the
> case, it returns -1. Future work can check if the EP has ATS support, in
> which case it has to check for the ACS direct translated bit.
> A user then needs to either disable the IOMMU and/or add the command
> line option to disable ACS for the specific downstream ports in the PCI
> hierarchy. This means the IOMMU groups will be less granular but
> presumably the person adding the command line argument understands this.
> We may also want to do some work so that there's informative dmesgs on
> which BDFs need to be specified on the command line so it's not so
> difficult for the user to figure out.

I'd advise caution with a user supplied BDF approach, we have no
guaranteed persistence for a device's PCI address. Adding a device
might renumber the buses, replacing a device with one that consumes
more/less bus numbers can renumber the buses, motherboard firmware
updates could renumber the buses, pci=assign-buses can renumber the
buses, etc. This is why the VT-d spec makes use of device paths when
describing PCI hierarchies, firmware can't know what bus number will be
assigned to a device, but it does know the base bus number and the path
of devfns needed to get to it. I don't know how we come up with an
option that's easy enough for a user to understand, but reasonably
robust against hardware changes. Thanks,