Re: [RFC] RDMA/core: Fix IPv6 loopback dst MAC address lookup logic

From: Jason Gunthorpe
Date: Thu Nov 21 2024 - 08:53:47 EST

On Thu, Nov 21, 2024 at 05:22:36PM +0800, Zelong Yue wrote:
> Gentle ping. Do I need to provide more detailed information on how to
> reproduce the issue?
> On 11/10/24 8:35 PM, yuezelong wrote:
> > Imagine we have two RNICs on a single machine, named eth1 and eth2, with
> >
> > - IPv4 addresses:,
> > - IPv6 addresses (scope global): fdbd::beef:2, fdbd::beef:3
> > - MAC addresses: 11:11:11:11:11:02, 11:11:11:11:11:03
> >
> > both connected to a gateway with MAC address 22:22:22:22:22:02.
> >
> > If we want to set up connections between these two RNICs with RC QPs, we
> > go through `rdma_resolve_ip` to look up the dst MAC address. The
> > procedure is the same as running the command
> >
> > `ip route get dst_addr from src_addr oif src_dev`
> >
> > In the IPv4 scenario, you would likely get
> >
> > ```
> > $ ip route get from oif eth2
> >
> > from via dev eth2 ...
> > ```
> >
> > This looks reasonable, as the traffic would go through the gateway.
> >
> > But in the IPv6 scenario, you would likely get
> >
> > ```
> > $ ip route get fdbd::beef:2 from fdbd::beef:3 oif eth2
> >
> > local fdbd::beef:2 from fdbd::beef:3 dev lo table local proto kernel src fdbd::beef:2 metric 0 pref medium
> > ```
> >
> > This leads the RDMA route lookup to fill in the dst MAC address with
> > the src net device's MAC address (11:11:11:11:11:03) but the dst IP
> > address with the dst net device's IPv6 address (fdbd::beef:2). The src
> > net device drops this packet, and we fail to set up the connection.
> >
> > To make loopback connections like this possible, we need to send
> > packets to the gateway and let the gateway send them back (the IPv4
> > lookup result already leads to this, so there is no problem in the IPv4
> > scenario). So we need to adjust the current lookup procedure: if we find
> > that the src device and dst device are on the same machine (same
> > namespace), we send the packets to the gateway instead of to the src
> > device itself.

We can't just override the routing like this. If you want that kind of
routing you need to set up the routing table to deliver it. For ipv4
these configurations almost always come with policy routing
configurations that avoid returning lo as a route. I assume ipv6 is
the same.

I'm not sure why your ipv4 example doesn't use lo either, by default
it should have. It suggests to me there are already some routing
overrides present.
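
A minimal sketch for checking what the lookup returns and whether policy
routing is in play. The helper names route_dev and resolves_to_lo are
hypothetical, the addresses are the ones from the example above, and the
commands only read routing state:

```shell
#!/bin/sh
# Print the output device named on the first line of `ip route get` output.
route_dev() {
    # Emit the token that follows the first "dev" keyword.
    awk 'NR == 1 { for (i = 1; i < NF; i++) if ($i == "dev") { print $(i + 1); exit } }'
}

# Succeed if the lookup resolves to the loopback device.
# Arguments: destination address, source address, outgoing device.
resolves_to_lo() {
    [ "$(ip route get "$1" from "$2" oif "$3" | route_dev)" = "lo" ]
}

# Example (assumes the eth2 setup described in the quoted mail):
#   resolves_to_lo fdbd::beef:2 fdbd::beef:3 eth2 && echo "lookup returns lo"
#
# Listing the rules and the local table shows whether overrides exist:
#   ip rule show
#   ip -6 rule show
#   ip -6 route show table local
```

If `ip rule show` lists anything beyond the default local, main and
default entries, some policy routing is already configured.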
