2024-11-14, 11:32:36 +0100, Antonio Quartulli wrote:
On 13/11/2024 12:05, Sabrina Dubroca wrote:
2024-11-12, 15:26:59 +0100, Antonio Quartulli wrote:
On 11/11/2024 16:41, Sabrina Dubroca wrote:
2024-10-29, 11:47:31 +0100, Antonio Quartulli wrote:
+void ovpn_peer_hash_vpn_ip(struct ovpn_peer *peer)
+ __must_hold(&peer->ovpn->peers->lock)
Changes to peer->vpn_addrs are not protected by peers->lock, so those
could be getting updated while we're rehashing (and taking peer->lock
in ovpn_nl_peer_modify as I'm suggesting above also wouldn't prevent
that).
/me screams :-D
Sorry :)
Indeed peers->lock is only about protecting the lists, not the content of
the listed objects.
How about acquiring the peers->lock before calling ovpn_nl_peer_modify()?
It seems like it would work. Maybe a bit weird to have conditional
locking (MP mode only), but ok. You already have this lock ordering
(hold peers->lock before taking peer->lock) in
ovpn_peer_keepalive_work_mp, so there should be no deadlock from doing
the same thing in the netlink code.
Yeah.
Then I would also do that in ovpn_peer_float to protect that rehash.
I am not extremely comfortable with this, because it means acquiring
peers->lock on every packet (right now we only take peer->lock there), and
it may defeat the advantage of the RCU locking on the hashtables.
Wouldn't you agree?
Hmpf, yeah. Then I think you could keep most of the current code,
except doing the rehash under both locks (peers + peer), and get
ss+sa_len for the rehash directly from peer->bind (instead of using
the ones we just defined locally in ovpn_peer_float, since they may
have changed while we released peer->lock to grab peers->lock). We may
end up "rehashing" twice into the same bucket if we have 2 concurrent
peer_float calls (call 1 sets remote r1, call 2 sets a new one r2,
call 1 hashes according to r2, call 2 also rehashes based on r2). That
should be ok (it can happen anyway that a "real" rehash lands in the
same bucket).
peer_float {
    spin_lock(peer)
    match/update bind
    spin_unlock(peer)

    if (MP) {
        spin_lock(peers)
        spin_lock(peer)
        rehash using peer->bind->remote rather than ss
        spin_unlock(peer)
        spin_unlock(peers)
    }
}
Does that sound reasonable?
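
For illustration only, the sequence above could be modelled in userspace
roughly as follows. This is a sketch, not the ovpn code: pthread mutexes
stand in for the kernel spinlocks, and struct peer_table, struct peer,
hash_remote() and the bucket field are all hypothetical names made up for
the example. The point it tries to show is that the rehash re-reads the
remote from the peer while holding both locks, instead of reusing the value
the caller passed in.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>

#define NBUCKETS 16u

/* Hypothetical stand-in for the peer table; only its lock matters here. */
struct peer_table {
	pthread_mutex_t lock;	/* models peers->lock */
};

/* Hypothetical stand-in for a peer. */
struct peer {
	pthread_mutex_t lock;	/* models peer->lock */
	uint32_t remote;	/* models peer->bind->remote */
	unsigned int bucket;	/* bucket the peer is currently hashed into */
};

/* Illustrative hash only; the real code hashes the full sockaddr. */
static unsigned int hash_remote(uint32_t remote)
{
	return (remote * 2654435761u) % NBUCKETS;
}

static void peer_float(struct peer_table *peers, struct peer *peer,
		       uint32_t new_remote, bool mp_mode)
{
	/* Step 1: update the binding under peer->lock only. */
	pthread_mutex_lock(&peer->lock);
	peer->remote = new_remote;
	pthread_mutex_unlock(&peer->lock);

	if (!mp_mode)
		return;

	/*
	 * Step 2: rehash under both locks, peers->lock first, then
	 * peer->lock. The remote is read again from the peer because a
	 * concurrent call may have changed it after step 1 dropped
	 * peer->lock; new_remote is deliberately not used here.
	 */
	pthread_mutex_lock(&peers->lock);
	pthread_mutex_lock(&peer->lock);
	peer->bucket = hash_remote(peer->remote);
	pthread_mutex_unlock(&peer->lock);
	pthread_mutex_unlock(&peers->lock);
}
```

With two concurrent calls, both may end up computing the bucket from the
latest remote, which matches the "rehashing twice into the same bucket" case
described above.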