Re: [PATCH net-next] net/smc: add support for netdevice in containers.

From: Niklas Schnelle
Date: Thu Sep 28 2023 - 11:06:00 EST


On Mon, 2023-09-25 at 10:35 +0800, Albert Huang wrote:
> If the netdevice is within a container and communicates externally
> through network technologies like VXLAN, we won't be able to find
> routing information in the init_net namespace. To address this issue,
> we need to add a struct net parameter to the smc_ib_find_route function.
> This allows us to locate the routing information within the corresponding
> net namespace, ensuring the correct completion of the SMC CLC interaction.
>
> Signed-off-by: Albert Huang <huangjie.albert@xxxxxxxxxxxxx>
> ---
> net/smc/af_smc.c | 3 ++-
> net/smc/smc_ib.c | 7 ++++---
> net/smc/smc_ib.h | 2 +-
> 3 files changed, 7 insertions(+), 5 deletions(-)
>

I'm trying to test this patch on s390x, but I'm running into the same
issue I hit with the original SMC namespace support:
https://lore.kernel.org/netdev/8701fa4557026983a9ec687cfdd7ac5b3b85fd39.camel@xxxxxxxxxxxxx/

Just like back then, I'm using a server and a client network namespace
on the same system, with two ConnectX-4 VFs from the same card and port.
Both TCP/IP traffic and user-space RDMA via "qperf … rc_bw" and
"qperf … rc_lat" work between the namespaces and definitely go via the
card.

I ran "rdma system set netns exclusive" and then moved the RDMA devices
into the namespaces with "rdma dev set <rdma_dev> netns <namespace>". I
also verified with "ip netns exec <namespace> rdma dev" that the RDMA
devices are in the network namespaces, and as the qperf runs show,
normal RDMA does work.
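
The setup described above can be sketched as the script below. It is only
an illustration: the namespace names (server, client), the RDMA device
names (mlx5_0, mlx5_1) and the netdev names (eth_srv, eth_clt) are
placeholders and not the ones from the actual test system, and the order
of the steps inside the helper is an assumption. By default the script
only prints the commands; set DRY_RUN=0 and run it as root to execute
them.

```shell
#!/bin/sh
# Sketch of the RDMA network namespace setup; all names are placeholders.

: "${DRY_RUN:=1}"    # 1 = only print the commands, 0 = execute them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Create a namespace and move one netdev plus its RDMA device into it.
# Arguments: <namespace> <rdma_dev> <netdev>
setup_rdma_netns() {
    ns=$1
    rdma_dev=$2
    netdev=$3
    run ip netns add "$ns"
    run ip link set "$netdev" netns "$ns"
    run rdma dev set "$rdma_dev" netns "$ns"
    # Check that the RDMA device is now visible inside the namespace.
    run ip netns exec "$ns" rdma dev
}

# Switch the RDMA subsystem to exclusive mode first, before any
# additional network namespace is created.
run rdma system set netns exclusive

setup_rdma_netns server mlx5_0 eth_srv
setup_rdma_netns client mlx5_1 eth_clt
```

In dry-run mode the output is simply the list of commands in the order
they would be issued, which makes it easy to compare against the steps
described above.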

For reference, the smc_chk tool gives me the following output:

Server started on port 37373
[DEBUG] Interfaces to check: eno4378
Test with target IP 10.10.93.12 and port 37373
Live test (SMC-D and SMC-R)
[DEBUG] Running client: smc_run /tmp/echo-clt.x0q8iO 10.10.93.12 -p 37373
[DEBUG] Client result: TCP 0x05000000/0x03030000
Failed (TCP fallback), reasons:
Client: 0x05000000 Peer declined during handshake
Server: 0x03030000 No SMC devices found (R and D)

I also checked that SMC is generally working: once I add an ISM device,
I do get SMC-D between the namespaces. Any ideas what could break SMC-R
here?

Thanks,
Niklas