Re: [RFC PATCH net-next] net/smc: Introduce receive queue flow control support

From: Guangguan Wang
Date: Fri Jan 28 2022 - 22:43:31 EST



On 2022/1/25 17:42, Stefan Raspl wrote:
>
> That's some truly substantial improvements!
> But we need to be careful with protocol-level changes: There are other operating systems like z/OS and AIX which have compatible implementations of SMC, too. Changes like a reduction of connections per link group or usage of reserved fields would need to be coordinated, and likely would have unwanted side-effects even when used with older Linux kernel versions.
> Changing the protocol is "expensive" insofar as it requires time to thoroughly discuss the changes, perform compatibility tests, and so on.
> So I would like to urge you to investigate alternative ways that do not require protocol-level changes to address this scenario, e.g. by modifying the number of completion queue elements, to see if this could yield similar results.
>
> Thx!
>

Yes, there are alternative ways. RNR is caused by a mismatch between the send rate and the receive rate, which means either sending too fast
or receiving too slowly. What this patch does is apply backpressure to the sending side when it sends too fast.

Another solution is to process and refill the receive queue as quickly as possible, which requires no protocol-level change.
The following modifications are needed:
- Enqueue cdc msgs to backlog queues instead of processing them in the rx tasklet. llc msgs remain unchanged.
- A mempool is needed, as cdc msgs are processed asynchronously. Allocate new receive buffers from the mempool when refilling the receive queue.
- Schedule backlog queues to other cpus, chosen by a 4-tuple or 5-tuple hash of the connections, to process the cdc msgs,
  in order to reduce the load on the cpu that the rx tasklet runs on.
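
To make the last item concrete, here is a minimal user-space sketch of mapping a connection 4-tuple to a cpu index. It is only an illustration under assumptions: struct conn_tuple, fnv1a_word() and conn_to_cpu() are hypothetical names, and FNV-1a stands in for whatever hash the kernel code would really use. The caller must pass nr_cpus > 0.

```c
#include <stdint.h>

/* Hypothetical connection key; the field names are illustrative only. */
struct conn_tuple {
	uint32_t saddr;
	uint32_t daddr;
	uint16_t sport;
	uint16_t dport;
};

/* Mix one 32-bit word into an FNV-1a hash, one byte at a time. */
static uint32_t fnv1a_word(uint32_t h, uint32_t w)
{
	for (int i = 0; i < 4; i++) {
		h ^= (w >> (8 * i)) & 0xffu;
		h *= 16777619u;		/* 32-bit FNV prime */
	}
	return h;
}

/*
 * Pick the cpu whose backlog queue should receive cdc msgs for this
 * connection. The same tuple always maps to the same cpu, so msgs of
 * one connection stay ordered on one queue. Requires nr_cpus > 0.
 */
static unsigned int conn_to_cpu(const struct conn_tuple *t,
				unsigned int nr_cpus)
{
	uint32_t h = 2166136261u;	/* 32-bit FNV offset basis */

	h = fnv1a_word(h, t->saddr);
	h = fnv1a_word(h, t->daddr);
	h = fnv1a_word(h, ((uint32_t)t->sport << 16) | t->dport);
	return h % nr_cpus;
}
```

A real implementation would probably also want to exclude the cpu that runs the rx tasklet from the candidates; the sketch does not do that.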

The pseudocode is shown below:
rx_tasklet
    if cdc_msgs
        enqueue to backlog;
        maybe smp_call_function_single_async is needed to wakeup the corresponding cpu to process backlog;
        allocate new buffer and modify the sge in rq_wr;
    else
        process remains unchanged;
    endif

    post_recv rq_wr;
end rx_tasklet

smp_backlog_process in corresponding cpu, called by smp_call_function_single_async
    for connections hashed to this cpu
        for cdc_msgs in backlog
            process cdc msgs;
        end cdc_msgs
    end connections
end smp_backlog_process
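
The enqueue and drain steps can also be modelled in plain user-space C, as a rough single-threaded sketch. Everything named here (struct cdc_msg, struct backlog, NR_CPUS_SKETCH, BACKLOG_DEPTH, backlog_enqueue(), backlog_drain(), count_msg()) is hypothetical and not taken from the SMC code. There is no locking and no cross-cpu wakeup, both of which the kernel version would need, and what the rx tasklet should do when a backlog is full is left open.

```c
#include <stddef.h>
#include <stdint.h>

#define NR_CPUS_SKETCH	4	/* assumed number of backlog queues */
#define BACKLOG_DEPTH	8	/* assumed per-cpu queue capacity */

/* Hypothetical stand-in for a received cdc msg. */
struct cdc_msg {
	uint32_t conn_id;
	uint32_t seq;
};

/* Fixed-size FIFO, one per cpu. */
struct backlog {
	struct cdc_msg slots[BACKLOG_DEPTH];
	size_t head;	/* index of the next msg to dequeue */
	size_t count;	/* number of queued msgs */
};

static struct backlog backlogs[NR_CPUS_SKETCH];
static size_t processed_total;

/* rx side: queue the msg for the given cpu. Returns -1 if the queue is full. */
static int backlog_enqueue(unsigned int cpu, struct cdc_msg m)
{
	struct backlog *b = &backlogs[cpu];

	if (b->count == BACKLOG_DEPTH)
		return -1;
	b->slots[(b->head + b->count) % BACKLOG_DEPTH] = m;
	b->count++;
	return 0;
}

/* Worker side: process all queued msgs in FIFO order, return how many. */
static size_t backlog_drain(unsigned int cpu,
			    void (*process)(const struct cdc_msg *))
{
	struct backlog *b = &backlogs[cpu];
	size_t n = 0;

	while (b->count > 0) {
		process(&b->slots[b->head]);
		b->head = (b->head + 1) % BACKLOG_DEPTH;
		b->count--;
		n++;
	}
	return n;
}

/* Example handler: only counts msgs; the real one would process the cdc msg. */
static void count_msg(const struct cdc_msg *m)
{
	(void)m;
	processed_total++;
}
```

In this model the rx side would call backlog_enqueue(cpu, msg) for each cdc msg, and the function triggered on the target cpu would call backlog_drain(cpu, count_msg).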

I'd like to hear your suggestions on this solution.
Thank you.