Re: [PATCH 2/2] iommu: add warning when sharing groups

From: Alex Williamson
Date: Wed Feb 15 2017 - 14:36:34 EST

On Tue, 14 Feb 2017 22:53:35 -0500
okaya@xxxxxxxxxxxxxx wrote:

> On 2017-02-14 18:51, Alex Williamson wrote:
> > On Tue, 14 Feb 2017 16:25:22 -0500
> > Sinan Kaya <okaya@xxxxxxxxxxxxxx> wrote:
> >
> >> The ACS requirement has been obscured in the current code and is only
> >> known by certain individuals who happen to read the code. Print out a
> >> warning with ACS path failure when ACS requirement is not met.
> >>
> >> Signed-off-by: Sinan Kaya <okaya@xxxxxxxxxxxxxx>
> >> ---
> >> drivers/iommu/iommu.c | 3 +++
> >> 1 file changed, 3 insertions(+)
> >>
> >> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> >> index dbe7f65..049ee0a 100644
> >> --- a/drivers/iommu/iommu.c
> >> +++ b/drivers/iommu/iommu.c
> >> @@ -811,6 +811,9 @@ struct iommu_group *pci_device_group(struct device
> >> *dev)
> >> if (IS_ERR(group))
> >> return NULL;
> >>
> >> + if (pci_is_root_bus(bus))
> >> + dev_warn_once(&pdev->dev, "using shared group due to ACS path
> >> failure\n");
> >> +
> >> return group;
> >> }
> >>
> >
> > The premise here is flawed. An IOMMU group based at the root bus
> > doesn't necessarily imply a lack of ACS. There are devices on root
> > buses, integrated endpoints and root ports. Naturally an IOMMU group
> > for these devices needs to be based at the root bus. Additionally,
> > there can be IOMMU groups developed around a lack of ACS that don't
> > intersect with the root bus. Since this is a warn_once, the false
> > positives for root bus devices are going to be enumerated first. On an
> > Intel system there's typically a device as 00.0 that will always be
> > pointlessly listed first. Also, it's not clear that grouping devices
> > together is always wrong, as Robin pointed out in the EHCI/OHCI
> > example. Lack of ACS on downtream ports is likely to cause problems,
> > especially if that downstream port exposes a slot. Maybe that would be
> > a good place to start. Also, what is someone supposed to do when they
> > see this error? If we can hope they'll look for the error in the code
> > (unlikely) a big comment with useful external links might be
> > necessary. Based on how easily vendors ignore kernel warnings, I'm
> > dubious there's any value to this path though. Thanks,
> Maybe, a better solution would be to add some sentences into vfio.txt
> documentation.
> I'm ready to drop this patch. I just don't want ACS requirement to be
> hidden between the source code.
> Would you be willing to do that?
> I know I read all pci and vfio documents in the past. I could have
> captured this requirement if it was there.

We already have this:

This isolation is not always at the granularity of a single device
though. Even when an IOMMU is capable of this, properties of devices,
interconnects, and IOMMU topologies can each reduce this isolation.
For instance, an individual device may be part of a larger multi-
function enclosure. While the IOMMU may be able to distinguish
between devices within the enclosure, the enclosure may not require
transactions between devices to reach the IOMMU. Examples of this
could be anything from a multi-function PCI device with backdoors
between functions to a non-PCI-ACS (Access Control Services) capable
bridge allowing redirection without reaching the IOMMU. Topology
can also play a factor in terms of hiding devices. A PCIe-to-PCI
bridge masks the devices behind it, making transaction appear as if
from the bridge itself. Obviously IOMMU design plays a major factor
as well.

Additionally if you google for "iommu group", this is the first hit:

This talks extensively about ACS. A few hits below that you can find a
presentation I've given with ACS examples. What additional
documentation do you think would have helped you discover or understand
this problem earlier?

I agree that device isolation is not a spec requirement. The specs
give us the tools that we need, but valid uses cases exist where a lack
of isolation may be preferred. If we logically deduce how we can give
a device or set of devices to a user for an untrusted environment,
isolated from other devices, I think it's pretty logical to come to the
conclusion that ACS is the only way that PCIe hardware can allow that
sort of control in a standard way. Clearly we also recognize that this
is a commonly overlooked area where hardware vendors may fail to
incorporate this subtly into their platform design guidelines and thus
we have numerous quirks for exposing virtual ACS-like isolation. The
hope is that adding a quirk for this devices means that feedback was
provided to the hardware teams and system architects within the
companies developing these devices to consider this use case and
implement native ACS in the next generation. Thanks,