Re: [RFC PATCH v2 0/3] l3mdev icmp error route lookup fixes

From: Michael Jeanson
Date: Wed Sep 23 2020 - 15:12:20 EST


On 2020-09-23 14 h 46, David Ahern wrote:
On 9/23/20 11:03 AM, Michael Jeanson wrote:
On 2020-09-23 12 h 04, Michael Jeanson wrote:
It should work without asymmetric routing; adding the return route to
the second vrf as I mentioned above fixes the FRAG_NEEDED problem. It
should work for TTL as well.

Adding a second pass on the tests with the return through r2 is fine,
but add a first pass for the more typical case.

Hi,

Before writing new tests I just want to make sure we are trying to fix
the same issue. If I add a return route to the red VRF then we don't
need this patchset, because whether the ICMP errors are routed using the
table from the source or the destination interface, they will reach the
source host.

The issue for which this patchset was sent only happens when the
destination interface's VRF doesn't have a route back to the source
host. I guess we might question if this is actually a bug or not.
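
To make the two cases concrete, here is a toy model of the lookup, not
kernel code. The addresses, device names and the ingress/egress table
names are all made up for illustration; the only point is that the
table chosen for the ICMP error lookup matters only when the egress VRF
has no route back to the source.

```python
import ipaddress

def lookup(table, dst):
    """Longest-prefix match of dst in a toy routing table; None if no route."""
    addr = ipaddress.ip_address(dst)
    best = None
    for prefix, nexthop in table.items():
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, nexthop)
    return best[1] if best else None

# Hypothetical address of the source host (assumption, not from the thread).
SRC_HOST = "172.16.1.2"

# Hypothetical tables. The ingress VRF knows the source network and has a
# leaked route towards the destination network in the egress VRF.
ingress_vrf = {
    "172.16.1.0/24": "dev eth-src",
    "172.16.2.0/24": "dev egress-vrf (leaked)",
}

# Asymmetric setup: the egress VRF has no route back to the source network.
egress_vrf_asymmetric = {"172.16.2.0/24": "dev eth-dst"}

# Symmetric setup: a return route is leaked into the egress VRF as well.
egress_vrf_symmetric = {
    **egress_vrf_asymmetric,
    "172.16.1.0/24": "dev ingress-vrf (leaked)",
}

def icmp_error_reaches_source(table):
    """True if a lookup for the source host in this table finds a route."""
    return lookup(table, SRC_HOST) is not None

for name, table in [
    ("ingress table", ingress_vrf),
    ("egress table, symmetric", egress_vrf_symmetric),
    ("egress table, asymmetric", egress_vrf_asymmetric),
]:
    print(name, "->", icmp_error_reaches_source(table))
```

In this model only the asymmetric egress table fails to find a route to
the source host, which is the case the patchset is about.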

So the question really is: when a packet is forwarded between VRFs
through route leaking and an ICMP error is generated, which table
should be used for the route lookup? And does it depend on the type of
ICMP error? (e.g. TTL=1 is detected before forwarding, but fragmentation
needed is detected afterwards, on the destination interface)

As a side note, I don't mind reworking the tests as you requested, even
if the patchset as a whole ends up not being needed, as long as you think
they are still useful. I just wanted to make sure we understood each other.


If you are leaking from VRF 1 to VRF 2 and you do not configure VRF 2
with how to send errors back to the source - MTU or TTL - then I will
argue that is a configuration problem, not a bug.

Now the TTL problem is interesting. You need the FIB lookup to know that
the packet is forwarded, and by the time of the ttl check in ip_forward
skb->dev points to the ingress VRF and dst points to the egress VRF. So
I think the change is warranted.

Let's do this for the tests:
One pass through all of the tests (TTL and MTU, v4 and v6) with symmetric
routing configured, making sure they all pass, i.e., keep all of them.
No sense losing the tests and the thoughts behind them.

Add a second pass with the asymmetric routing per the customer setup
since it motivated the investigation.

Rename the test to something like vrf_route_leaking.sh. It can be
expanded with more tests related to route leaking as they come up.


Just a final clarification: the asymmetric setup would have no return
route in VRF 2 and would only test the TTL case, since the others would
fail?