Re: [PATCH v6 0/3] Add new PCI_DEV_FLAGS_NO_RELAXED_ORDERING flag
From: Casey Leedom
Date: Mon Jul 10 2017 - 20:02:10 EST
Hey Alexander,
Okay, I understand your point regarding the "most likely scenario" being
TLPs directed upstream to the Root Complex. But I'd still like to make sure
that we have an agreed-upon API/methodology for doing Peer-to-Peer with
Relaxed Ordering while not using Relaxed Ordering to the Root Complex. I
don't see how the proposed APIs can be used in that fashion.
Right now the proposed change for cxgb4 is for it to test its own PCIe
Capability Device Control[Relaxed Ordering Enable] in order to use that
information to program the Chelsio Hardware to emit/not emit upstream TLPs
with the Relaxed Ordering Attribute set. But if we're going to have the
mixed mode situation I describe, the PCIe Capability Device Control[Relaxed
Ordering Enable] will have to be set which means that we'll be programming
the Chelsio Hardware to send upstream TLPs with Relaxed Ordering Enable to
the Root Complex which is what we were trying to avoid in the first place ...
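A minimal sketch of the check described above, written as stand-alone C
rather than actual cxgb4 code. The helper name is hypothetical; the bit
value matches PCI_EXP_DEVCTL_RELAX_EN in the Linux pci_regs.h header. In
the kernel the register value would come from something like
pcie_capability_read_word(pdev, PCI_EXP_DEVCTL, &devctl).

```c
#include <stdbool.h>
#include <stdint.h>

/* Device Control: Enable Relaxed Ordering (same value as in pci_regs.h). */
#define PCI_EXP_DEVCTL_RELAX_EN 0x0010

/*
 * Hypothetical helper: decide from the device's own Device Control
 * register whether it may emit TLPs with the Relaxed Ordering
 * Attribute set.
 */
static bool devctl_relaxed_ordering_enabled(uint16_t devctl)
{
	return (devctl & PCI_EXP_DEVCTL_RELAX_EN) != 0;
}
```

Note that this is one bit for the whole device: it carries no information
about where a given TLP is headed, which is exactly the limitation in the
mixed mode case.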
[[ And, as I noted on Friday evening, the current cxgb4 Driver hardwires
the Relaxed Ordering Enable on early during device probe, so that
would minimally need to be addressed even if we decide that we don't
ever want to support mixed mode Relaxed Ordering. ]]
We need some method of telling the Chelsio Driver that it should/shouldn't
use Relaxed Ordering with TLPs directed at the Root Complex. And the same
is true for a Peer PCIe Device.
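One possible shape for such a method, purely as an illustration and not
part of the proposed patch series: a per-destination policy that the
driver consults before setting the Relaxed Ordering Attribute. All names
here (tlp_target, ro_policy, use_relaxed_ordering) are hypothetical.

```c
#include <stdbool.h>

/* Hypothetical: where an outgoing TLP is directed. */
enum tlp_target {
	TLP_TARGET_ROOT_COMPLEX,
	TLP_TARGET_PEER,
};

/* Hypothetical per-device policy, filled in by the PCI core or quirks. */
struct ro_policy {
	bool devctl_ro_enabled;	/* Device Control[Relaxed Ordering Enable] */
	bool root_complex_ok;	/* Root Complex handles RO TLPs well */
	bool peer_ok;		/* Peer device handles RO TLPs well */
};

/*
 * Relaxed Ordering is used only if the sender is allowed to emit it at
 * all AND the specific destination is known to cope with it.
 */
static bool use_relaxed_ordering(const struct ro_policy *p,
				 enum tlp_target target)
{
	if (!p->devctl_ro_enabled)
		return false;

	return target == TLP_TARGET_ROOT_COMPLEX ? p->root_complex_ok
						 : p->peer_ok;
}
```

With devctl_ro_enabled and peer_ok set but root_complex_ok clear, this
yields the mixed mode: Relaxed Ordering for Peer-to-Peer, none for TLPs
to the Root Complex.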
It may be that we should approach this from the completely opposite
direction and instead of having quirks which identify problematic devices,
have quirks which identify devices which would benefit from the use of
Relaxed Ordering (if the sending device supports that). That is, assume that
using Relaxed Ordering shouldn't be done unless the target device says "I
love Relaxed Ordering TLPs" ... In such a world, an NVMe or a Graphics
device might declare love of Relaxed Ordering and the same for a SPARC Root
Complex (I think that was your example).
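A rough sketch of that opt-in direction, assuming a simple table keyed by
Vendor/Device ID. The table, the helper name and the IDs are placeholders
made up for illustration; they do not correspond to real hardware or to
any existing kernel quirk mechanism.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical opt-in entry: a target that benefits from RO TLPs. */
struct ro_optin {
	uint16_t vendor;
	uint16_t device;
};

/* Placeholder IDs only, not real devices. */
static const struct ro_optin ro_optin_table[] = {
	{ 0x1234, 0x0001 },
	{ 0x1234, 0x0002 },
};

/*
 * Default is "no Relaxed Ordering": a target gets RO TLPs only if it
 * has explicitly declared that it wants them.
 */
static bool target_wants_relaxed_ordering(uint16_t vendor, uint16_t device)
{
	size_t n = sizeof(ro_optin_table) / sizeof(ro_optin_table[0]);

	for (size_t i = 0; i < n; i++) {
		if (ro_optin_table[i].vendor == vendor &&
		    ro_optin_table[i].device == device)
			return true;
	}
	return false;
}
```

The design difference from the current quirk is the default: an unknown
target is treated as not wanting Relaxed Ordering, instead of assuming it
is fine until a quirk says otherwise.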
By the way, the sole example of Data Corruption with Relaxed Ordering is
the AMD A1100 ARM SoC and AMD appears to have given up on that almost as
soon as it was released. So what we're left with currently is a performance
problem on modern Intel CPUs ... (And hopefully we'll get a Technical
Publication on that issue fairly soon.)
Casey