Re: [PATCH v21 18/24] ovpn: add support for peer floating

From: Antonio Quartulli
Date: Mon Mar 10 2025 - 08:58:32 EST


On 07/03/2025 11:12, Sabrina Dubroca wrote:
2025-03-06, 11:02:50 +0100, Antonio Quartulli wrote:
On 05/03/2025 17:56, Sabrina Dubroca wrote:
2025-03-05, 14:14:36 +0100, Antonio Quartulli wrote:
On 05/03/2025 12:20, Sabrina Dubroca wrote:
2025-03-05, 00:19:32 +0100, Antonio Quartulli wrote:
On 04/03/2025 19:37, Sabrina Dubroca wrote:
2025-03-04, 01:33:48 +0100, Antonio Quartulli wrote:
+void ovpn_peer_endpoints_update(struct ovpn_peer *peer, struct sk_buff *skb)
+{
+ struct hlist_nulls_head *nhead;
+ struct sockaddr_storage ss;
+ const u8 *local_ip = NULL;
+ struct sockaddr_in6 *sa6;
+ struct sockaddr_in *sa;
+ struct ovpn_bind *bind;
+ size_t salen = 0;
+
+ spin_lock_bh(&peer->lock);
+ bind = rcu_dereference_protected(peer->bind,
+ lockdep_is_held(&peer->lock));
+ if (unlikely(!bind))
+ goto unlock;
+
+ switch (skb->protocol) {
+ case htons(ETH_P_IP):
+ /* float check */
+ if (unlikely(!ovpn_bind_skb_src_match(bind, skb))) {
+ if (bind->remote.in4.sin_family == AF_INET)
+ local_ip = (u8 *)&bind->local;

If I'm reading this correctly, we always reuse the existing local
address when we have to re-create the bind, even if it doesn't match
the skb? The "local endpoint update" chunk below is doing that, but
only if we're keeping the same remote? It'll get updated the next time
we receive a packet and call ovpn_peer_endpoints_update.

That might irritate the RPF check on the other side, if we still use
our "old" source to talk to the new dest?

+ sa = (struct sockaddr_in *)&ss;
+ sa->sin_family = AF_INET;
+ sa->sin_addr.s_addr = ip_hdr(skb)->saddr;
+ sa->sin_port = udp_hdr(skb)->source;
+ salen = sizeof(*sa);
+ break;

I think the issue is simply this 'break' above - by removing it, everything
should work as expected.

Only if the bind was of the correct family? Checking an IPv4 local
address (in the bind) against an IPv6 source address in the packet (or
the other way around) isn't going to work well.

Ah I understand what you mean.

The purpose of "local_ip" is to provide a working local endpoint to be used
with the new remote address.
However, if the float is switching family we can't re-use the same old local
endpoint (hence the check).
In this case we'll learn the "new" local address later.

Does it make sense?

Sure, but we could have learned it immediately from the packet we just
got, whether we're changing family or not. No need to wait for the
next RX packet to also learn the new local address.

Indeed.


But if we now do a dst_cache_reset with the peer float,
ovpn_udp*_output will have to do a new route/local address lookup and
I guess that should clean up the local address stored in the bind, and
then update the dst_cache with the local address we just found.

Right and this may not truly be what we want.

If peer X is sending packets to our IP1, we should at least try to reply
from the same address.

If we have two IPs, IP1 and IP2, and both can be used to reach peer X, we
should always try to use the one where we received traffic from X in the
first place.

I had a thought that it might not be our prefered address to talk to
X, but it would probably be, since we decided to use it (and thus X
used it as remote to talk to us).

I am not sure I follow this sentence: I think you are just confirming what I said above (please correct me if I am wrong)?


OTOH hand it is also true that with floating detection on both sides, the
situation will converge quickly, but there might be a reason why X chose IP1
as destination, therefore we should do our best to respect that.

And I guess the primary reason for X to choose IP1 would be "we sent
packets to X from IP1".

Probably. It truly depends on who initiated the connection.


So, even in case of float, we should still store the local endpoint and
attempt fetching a route that takes that into consideration.
Which I think is what is happening (assuming we reset the dst_cache on
float).

Not at the same time as float, unless ovpn_peer_endpoints_update sets
local_ip = ip_hdr(skb)->daddr unconditionally on float?

Otherwise the next route lookup in ovpn_udpX_output will pick whatever
source address it wants (which would likely match what's in the
received skb during float, so probably fine anyway).


But that's what the code just below in ovpn_peer_endpoints_update() does, no?


223 /* local endpoint update */
224 if (unlikely(bind->local.ipv4.s_addr != ip_hdr(skb)->daddr)) {

...

229 bind->local.ipv4.s_addr = ip_hdr(skb)->daddr;


ovpn_udpX_output() will:
* get no rt from the cache
* possibly confirm that saddr is ok
* fetch the new rt using the provided saddr and daddr
* update the cache.

That makes sense to me.
Would you agree?

With dst_cache reset on float, yes. As long as we have that, the main
behavior seems correct to me. (maybe some corner cases will not be
handled optimally, but that can be improved later - which is most
likely what I've been discussing in these emails :))

Yeah :)


[this could be a useful counter to add in the future: number of floats
and local address updates - so the user can check if that's increasing
"too often", which would indicate something weird is happening]

ACK, good idea!

Thanks!

Ok, I'll probably wait a little more and then prepare v22.

Cheers,


--
Antonio Quartulli
OpenVPN Inc.