Re: [PATCH] iommu: Relax ACS requirement for RCiEP devices.
From: Alex Williamson
Date: Tue May 05 2020 - 11:34:24 EST
On Tue, 5 May 2020 07:56:06 -0700
"Raj, Ashok" <ashok.raj@xxxxxxxxx> wrote:
> On Tue, May 05, 2020 at 08:05:14AM -0600, Alex Williamson wrote:
> > On Mon, 4 May 2020 23:11:07 -0700
> > "Raj, Ashok" <ashok.raj@xxxxxxxxx> wrote:
> >
> > > Hi Alex
> > >
> > > + Joerg, accidently missed in the Cc.
> > >
> > > On Mon, May 04, 2020 at 11:19:36PM -0600, Alex Williamson wrote:
> > > > On Mon, 4 May 2020 21:42:16 -0700
> > > > Ashok Raj <ashok.raj@xxxxxxxxx> wrote:
> > > >
> > > > > PCIe Spec recommends we can relax ACS requirement for RCIEP devices.
> > > > >
> > > > > PCIe 5.0 Specification.
> > > > > 6.12 Access Control Services (ACS)
> > > > > Implementation of ACS in RCiEPs is permitted but not required. It is
> > > > > explicitly permitted that, within a single Root Complex, some RCiEPs
> > > > > implement ACS and some do not. It is strongly recommended that Root Complex
> > > > > implementations ensure that all accesses originating from RCiEPs
> > > > > (PFs and VFs) without ACS capability are first subjected to processing by
> > > > > the Translation Agent (TA) in the Root Complex before further decoding and
> > > > > processing. The details of such Root Complex handling are outside the scope
> > > > > of this specification.
> > > > >
> > > >
> > > > Is the language here really strong enough to make this change? ACS is
> > > > an optional feature, so being permitted but not required is rather
> > > > meaningless. The spec is also specifically avoiding the words "must"
> > > > or "shall" and even when emphasized with "strongly", we still only have
> > > > a recommendation that may or may not be honored. This seems like a
> > > > weak basis for assuming that RCiEPs universally honor this
> > > > recommendation. Thanks,
> > > >
> > >
> > > We are speaking about PCIe spec, where people write it about 5 years ahead
> > > and every vendor tries to massage their product behavior with vague
> > > words like this.. :)
> > >
> > > But honestly for any any RCiEP, or even integrated endpoints, there
> > > is no way to send them except up north. These aren't behind a RP.
> >
> > But they are multi-function devices and the spec doesn't define routing
> > within multifunction packages. A single function RCiEP will already be
> > assumed isolated within its own group.
>
> That's right. The other two devices only have legacy PCI headers. So
> they can't claim to be RCiEP's but just integrated endpoints. The legacy
> devices don't even have a PCIe header.
>
> I honestly don't know why these are groped as MFD's in the first place.
>
> >
> > > I did check with couple folks who are part of the SIG, and seem to agree
> > > that ACS treatment for RCiEP's doesn't mean much.
> > >
> > > I understand the language isn't strong, but it doesn't seem like ACS should
> > > be a strong requirement for RCiEP's and reasonable to relax.
> > >
> > > What are your thoughts?
> >
> > I think hardware vendors have ACS at their disposal to clarify when
> > isolation is provided, otherwise vendors can submit quirks, but I don't
> > see that the "strongly recommended" phrasing is sufficient to assume
> > isolation between multifunction RCiEPs. Thanks,
>
> You point is that integrated MFD endpoints, without ACS, there is no
> gaurantee to SW that they are isolated.
>
> As far as a quirk, do you think:
> - a cmdline optput for integrated endpoints, and RCiEP's suffice?
> along with a compile time default that is strict enforcement
> - typical vid/did type exception list?
>
> A more generic way to ask for exception would be scalable until we can stop
> those type of integrated devices. Or we need to maintain these device lists
> for eternity.
I don't think the language in the spec is anything sufficient to handle
RCiEP uniquely. We've previously rejected kernel command line opt-outs
for ACS, and the extent to which those patches still float around the
user community and are blindly used to separate IOMMU groups are a
testament to the failure of this approach. Users do not have a basis
for enabling this sort of opt-out. The benefit is obvious in the IOMMU
grouping, but the risk is entirely unknown. A kconfig option is even
worse as that means if you consume a downstream kernel, the downstream
maintainers might have decided universally that isolation is less
important than functionality.
I think the only solution is that the hardware vendors need to step up
to indicate where devices are isolated. The hardware can do this
itself by implementing ACS, otherwise we need quirks. I think we've
also generally been reluctant to accept quirks that provide a blanket
opt-out for a vendor because doing so is akin to trying to predict the
future (determining the behavior of all current and previous hardware
is generally a sufficiently impossible task already). Perhaps if a
vendor has a published internal policy regarding RCiEP isolation and is
willing to stand by a quirk, there might be room to negotiate. Thanks,
Alex