Re: [PATCH rdma-next v1 0/2] Enable relaxed ordering for ULPs

From: Jason Gunthorpe
Date: Wed May 26 2021 - 15:30:43 EST


On Thu, May 20, 2021 at 01:13:34PM +0300, Leon Romanovsky wrote:
> From: Leon Romanovsky <leonro@xxxxxxxxxx>
>
> Changelog:
> v1:
> * Enabled RO by default in IB/core instead of changing all users
> v0: https://lore.kernel.org/lkml/20210405052404.213889-1-leon@xxxxxxxxxx
>
> From Avihai,
>
> Relaxed Ordering is a PCIe mechanism that relaxes the strict ordering
> imposed on PCI transactions, and thus, can improve performance for
> applications that can handle this lack of strict ordering.
>
> Currently, relaxed ordering can be set only by user space applications
> for user MRs. Not all user space applications support relaxed ordering
> and for this reason it was added as an optional capability that is
> disabled by default. This behavior is not changed as part of this series,
> and relaxed ordering remains disabled by default for user space.
>
> On the other hand, kernel users should universally support relaxed
> ordering, as they are designed to read data only after observing the CQE
> and use the DMA API correctly. There are a few platforms with a broken
> relaxed ordering implementation, but for them relaxed ordering is expected
> to be turned off globally at the PCI level. In addition, note that this is
> not the first use of relaxed ordering. Relaxed ordering has been enabled
> by default in the mlx5 Ethernet driver, and user space apps have used it
> as well for quite a while.
>
> Hence, this series enables relaxed ordering by default for kernel users so
> they can benefit as well from the performance improvements.
>
> The following test results show the performance improvement achieved
> with relaxed ordering. The test was performed by running FIO traffic
> between an NVIDIA DGX A100 (ConnectX-6 NICs and AMD CPUs) and an NVMe
> storage fabric, using NFSoRDMA:
>
> Without Relaxed Ordering:
> READ: bw=16.5GiB/s (17.7GB/s), 16.5GiB/s-16.5GiB/s (17.7GB/s-17.7GB/s),
> io=1987GiB (2133GB), run=120422-120422msec
>
> With Relaxed Ordering:
> READ: bw=72.9GiB/s (78.2GB/s), 72.9GiB/s-72.9GiB/s (78.2GB/s-78.2GB/s),
> io=2367GiB (2542GB), run=32492-32492msec
>
> The series has been tested over NVMe, iSER, SRP and NFS with ConnectX-6
> NIC. The tests included FIO verify and stress tests, and various
> resiliency tests (shutting down NIC port in the middle of traffic,
> rebooting the target in the middle of traffic etc.).

There was such a big discussion on the last version that I wondered why
this one was so quiet. I guess it is because the cc list isn't very big.

Adding the people from the original thread, here are the patches:

https://lore.kernel.org/linux-rdma/cover.1621505111.git.leonro@xxxxxxxxxx/

I think this is the general approach that was asked for: special-case
uverbs and turn it on universally in the kernel.
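
As a rough illustration of that split, here is a minimal sketch. It is not
the code in the patches: the flag names, their values, and the helper
effective_access() are all made up for this example. It only shows the idea
that the uverbs path keeps exactly what user space asked for, while kernel
ULPs get relaxed ordering added for them.

```c
#include <stdbool.h>

/* Hypothetical flag values for illustration only; not the kernel's definitions. */
#define ACCESS_LOCAL_WRITE      (1u << 0)
#define ACCESS_REMOTE_READ      (1u << 1)
#define ACCESS_RELAXED_ORDERING (1u << 2)

/*
 * Hypothetical helper: compute the access flags actually used for an MR.
 *
 * from_uverbs == true:  user space MR, flags are passed through untouched,
 *                       so relaxed ordering stays opt-in for applications.
 * from_uverbs == false: kernel ULP MR, relaxed ordering is always added.
 */
unsigned int effective_access(unsigned int requested, bool from_uverbs)
{
	if (from_uverbs)
		return requested;

	return requested | ACCESS_RELAXED_ORDERING;
}
```

With these assumed names, effective_access(ACCESS_LOCAL_WRITE, false) has
the relaxed ordering bit set, and effective_access(ACCESS_LOCAL_WRITE, true)
does not.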

Jason