Re: [RFC PATCH net-next] net/smc: Introduce receive queue flow control support

From: Stefan Raspl
Date: Tue Jan 25 2022 - 04:48:57 EST


On 1/20/22 07:51, Guangguan Wang wrote:
This implement rq flow control in smc-r link layer. QPs
communicating without rq flow control, in the previous
version, may result in RNR (reveive not ready) error, which
means when sq sends a message to the remote qp, but the
remote qp's rq has no valid rq entities to receive the message.
In RNR condition, the rdma transport layer may retransmit
the messages again and again until the rq has any entities,
which may lower the performance, especially in heavy traffic.
Using credits to do rq flow control can avoid the occurrence
of RNR.

That's some truly substantial improvements!
But we need to be careful with protocol-level changes: There are other operating systems like z/OS and AIX which have compatible implementations of SMC, too. Changes like a reduction of connections per link group or usage of reserved fields would need to be coordinated, and likely would have unwanted side-effects even when used with older Linux kernel versions.
Changing the protocol is "expensive" insofar as it requires time to thoroughly discuss the changes, perform compatibility tests, and so on.
So I would like to urge you to investigate alternative ways that do not require protocol-level changes to address this scenario, e.g. by modifying the number of completion queue elements, to see if this could yield similar results.

Thx!