Re: [PATCH v4 04/14] PCI/P2PDMA: Clear ACS P2P flags for all devices behind switches

From: Alex Williamson
Date: Tue May 08 2018 - 19:12:07 EST


On Tue, 8 May 2018 22:25:06 +0000
"Stephen Bates" <sbates@xxxxxxxxxxxx> wrote:

> > Yeah, so based on the discussion I'm leaning toward just having a
> > command line option that takes a list of BDFs and disables ACS
> > for them. (Essentially as Dan has suggested.) This avoids the
> > shotgun.
>
> I concur that this seems to be where the conversation is taking us.
>
> @Alex - Before we go do this can you provide input on the approach? I
> don't want to re-spin only to find we are still not converging on the
> ACS issue....

I can envision numerous implementation details that make this less
trivial than it sounds, but it seems like the thing we need to decide
first is whether deliberately leaving windows between devices, in order
to exploit them for direct P2P DMA in an otherwise IOMMU-managed
address space, is something we want to do. From a security
perspective, we already handle this with IOMMU groups because many
devices do not support ACS; the new thing is embracing this rather than
working around it. It makes me a little twitchy, but so long as the
IOMMU groups match the expected worst case routing between devices,
it's really no different than if we could wipe the ACS capability from
the device.
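
As a rough illustration of the point above (a userspace sketch, not
code from this patch series): the bit values below match the kernel's
pci_regs.h, and REQ_ACS_FLAGS mirrors the set the IOMMU grouping code
requires for isolation. P2P_REDIR_FLAGS and both helper functions are
hypothetical names for this sketch, and the real kernel check also
consults the ACS capability register, which is omitted here.

```c
#include <stdbool.h>
#include <stdint.h>

/* ACS control bits, values as in include/uapi/linux/pci_regs.h */
#define PCI_ACS_SV 0x0001 /* Source Validation */
#define PCI_ACS_TB 0x0002 /* Translation Blocking */
#define PCI_ACS_RR 0x0004 /* P2P Request Redirect */
#define PCI_ACS_CR 0x0008 /* P2P Completion Redirect */
#define PCI_ACS_UF 0x0010 /* Upstream Forwarding */
#define PCI_ACS_EC 0x0020 /* P2P Egress Control */
#define PCI_ACS_DT 0x0040 /* Direct Translated P2P */

/* Flags the IOMMU grouping code requires before it treats a port as
 * an isolation point (REQ_ACS_FLAGS in drivers/iommu/iommu.c). */
#define REQ_ACS_FLAGS (PCI_ACS_SV | PCI_ACS_RR | PCI_ACS_CR | PCI_ACS_UF)

/* Assumption for this sketch: opening a P2P window means clearing
 * only the two redirect bits. */
#define P2P_REDIR_FLAGS (PCI_ACS_RR | PCI_ACS_CR)

/* Simplified: true if a port with this ACS control value would still
 * count as isolating the devices below it. */
static bool acs_port_isolates(uint16_t acs_ctrl)
{
	return (acs_ctrl & REQ_ACS_FLAGS) == REQ_ACS_FLAGS;
}

/* Return the control value with the P2P redirect bits cleared. */
static uint16_t acs_open_p2p_window(uint16_t acs_ctrl)
{
	return (uint16_t)(acs_ctrl & ~P2P_REDIR_FLAGS);
}
```

Once the redirect bits are cleared, acs_port_isolates() returns false,
which is the same answer it would give for a port with no ACS
capability at all. That is the sense in which the grouping stays
correct as long as it is computed from the flags actually in effect.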

On to the implementation details... I already mentioned the BDF issue
in my other reply. If we had a way to persistently identify a device,
would we specify the downstream points at which we want to disable ACS
or the endpoints that we want to connect? The latter has a problem
that the grouping upstream of an endpoint is already set by the time we
discover the endpoint, so we might need to unwind to get the grouping
correct. The former might make it harder for users to find the
necessary nodes, but would be easier for the kernel to deal with during
discovery. A runtime, sysfs approach has some benefits here,
especially in identifying the device assuming we're ok with leaving
the persistence problem to userspace tools. I'm still a little fond of
the idea of exposing an acs_flags attribute for devices in sysfs where
a write would do a soft unplug and re-add of all affected devices to
automatically recreate the proper grouping. Any dynamic change in
routing and grouping would require all DMA be re-established anyway and
a soft hotplug seems like an elegant way of handling it. Thanks,

Alex
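
To make the acs_flags idea a bit more concrete, here is a minimal
userspace sketch of how a value written to such an attribute might be
validated. No such attribute exists today; acs_flags_parse(),
acs_flags_change_needs_readd() and ACS_KNOWN_MASK are hypothetical
names, and the rule that only changes to the isolation-relevant bits
trigger a re-add is an assumption of this sketch, not a settled design.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* All ACS control bits defined in pci_regs.h (SV|TB|RR|CR|UF|EC|DT). */
#define ACS_KNOWN_MASK 0x007fUL

/* Bits that affect IOMMU grouping: SV|RR|CR|UF (REQ_ACS_FLAGS). */
#define ACS_GROUPING_MASK 0x001d

/* Hypothetical: parse a string written to acs_flags. Accepts decimal,
 * octal or hex with one optional trailing newline. Returns 0 on
 * success, -EINVAL on malformed input or unknown bits. */
static int acs_flags_parse(const char *buf, uint16_t *flags)
{
	char *end;
	unsigned long val;

	errno = 0;
	val = strtoul(buf, &end, 0);
	if (end == buf || errno == ERANGE)
		return -EINVAL;
	if (*end == '\n')
		end++;
	if (*end != '\0')
		return -EINVAL;
	if (val & ~ACS_KNOWN_MASK)
		return -EINVAL;

	*flags = (uint16_t)val;
	return 0;
}

/* Assumption: a soft unplug and re-add is only needed when a bit that
 * influences grouping actually changes. */
static bool acs_flags_change_needs_readd(uint16_t old_flags,
					 uint16_t new_flags)
{
	return ((old_flags ^ new_flags) & ACS_GROUPING_MASK) != 0;
}
```

A write handler built along these lines would reject unknown bits up
front, then remove and re-add the affected devices only when the
grouping could actually change.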