Re: [PATCH v6 0/3] Add new PCI_DEV_FLAGS_NO_RELAXED_ORDERING flag

From: Casey Leedom
Date: Wed Jul 12 2017 - 20:52:40 EST


  Sorry again for the delay.  This time at least partially caused by a Chelsio-internal Customer Support request to simply disable Relaxed Ordering entirely due to the performance issues with our 100Gb/s product and relatively recent Intel Root Complexes. Our Customer Support people are tired of asking customers to try turning off Relaxed Ordering. (sigh)

So, first off, I've mentioned a couple of times that the current cxgb4 driver hardwired the PCIe Capability Device Control[Relaxed Ordering Enable] on. Here's the code which does it:

drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c:4657:

static void enable_pcie_relaxed_ordering(struct pci_dev *dev)
{
pcie_capability_set_word(dev, PCI_EXP_DEVCTL, PCI_EXP_DEVCTL_RELAX_EN);
}

This is called from the PCI Probe() routine init_one() later in that file. I just wanted to make sure people knew about this. Obviously given our current very difficult thread, this would either need to be diked out or changed or a completely different mechanism put in place.

Second, just to make sure everyone's on the same page, the above simply allows the device to send TLPs with the Relaxed Ordering Attribute. It doesn't cause TLPs to suddenly all be sent with RO set. The use of Relaxed Ordering is selective. For instance, in our hardware we can configure the RX Path to use RO on Ingress Packet Data delivery to Free List Buffers, but not use RO for delivery of messages noting newly delivered Ingress Packet Data. Doing this allows the destination PCIe target to [potentially] optimize the DMA Writes to it based on local conditions (memory controller channel availability, etc.), but ensure that the message noting newly delivered Ingress Packet Data isn't processed till all of the preceding TLPs with RO set containing Ingress Packet Data have been processed. (This by the way is the essence of the AMD A1100 ARM SoC bug: its Root Complex isn't obeying that PCIe ordering rule.)

Third, as noted above, I'm getting a lot of pressure to get this addressed sooner than later, so I think that we should go with something fairly simple along the lines that you guys are proposing and I'll stop whining about the problem of needing to handle Peer-to-Peer with Relaxed Ordering while not using it for deliveries to the Root Complex. We can just wait for that kettle of fish to explode on us and deal with the mess then. (Hhmmm, the mixed metaphor landed in an entirely different place than I originally intended ... :-))

If we try to stick as closely to Ding's latest patch set as possible, then we can probably just add the diff to remove the enable_pcie_relaxed_ordering() code in cxgb4_main.c.

Casey