On 11/5/21 07:04, Leonard Crestez wrote:
On 11/5/21 3:22 AM, Dmitry Safonov wrote:[..]
I remember we discussed it in RFC, that removing a key that's currently
in use may result in random MKT to be used.
I think, it's possible to make this API a bit more predictable if:
- DEL command fails to remove a key that is current/receive_next;
- opt.flags has CURR/NEXT flag that has corresponding `u8 current_key`
and `u8 receive_next` values. As socket lock is held - that makes
current_key/receive_next change atomic with deletion of an existing key
that might have been in use.
In result user may remove a key that's not in use or has to set new
current/next. Which avoids the issue with random MKT being used to sign
segments.
The MKT used to sign segments is already essentially random unless the
user makes a deliberate choice. This is what happens if you add two keys
an call connect(). But why is this a problem?
The issue is predictability and less control for a user on how the key
is selected.
Let's say as a user I have two MKTs A and B. I want to use A for 6 weeks
and then change to B. I want to switch to B as soon as the admin of the
peer adds the key and the peer sends me (rnext_key = B.id).
With your semantics currently a random key will be used as long as I
don't "lock" the id which means that rnext_key won't be respected.
So there's clearly less predictability for a user to select current key
in use.
Applications which want to deliberately control the send key can do so
with TCP_AUTHOPT_FLAG_LOCK_KEYID. If that flag is not set then the key
with send_id == recv_rnextkeyid is preffered as suggested by the RFC, or
a random one on connect.
I think your suggestion would force additional complexity on all
applications for no clear gain.
I disagree. From RFC (3.1):
"It is presumed that an MKT affecting a particular connection cannot be
destroyed during an active connection -- or, equivalently, that its
parameters are copied to an area local to the connection (i.e.,
instantiated) and so changes would affect only new connections."
which means that the user shouldn't be able to remove a key in use.
So, by default you should return an error if the key in use being deleted.
The only use-case to delete a key that is in use is if it has been
compromised RFC(6.1):
"Deciding when to start using a key is a performance issue. Deciding
when to remove an MKT is a security issue. Invalid MKTs are expected to
be removed. TCP-AO provides no mechanism to coordinate their removal, as
we consider this a key management operation."
I might misread the RFC, but it seems that shouldn't happen in an
ordinary usage scenario (as long as the user don't --force removal of
the compromised key in an exceptional case).
So, if you allow a user to set current_key/rnext_key atomically with
removal - it seems to fit this --force use-case and let user more
control over which key is in use.
Key selection controls are only added much later in the series, this is
also part of the effort to split the code into readable patches. See
this patch:
https://lore.kernel.org/netdev/2dc569c0d60c80c26aafcaa201ba5b5ec53ce6bd.1635784253.git.cdleonard@xxxxxxxxx/
A separate issue with that one (if I'm not misreading) seems to be that
you're going to send segments with info->send_rnextkeyid if the deleted
key was TCP_AUTHOPT_FLAG_LOCK_RNEXTKEYID one.
And won't be able to verify the peer inbound segments/replies.
Removing a key while traffic is happening shouldn't cause failures in
recv or send code; this takes some effort but is also required to
prevent auth failures when a socket is closed and transitions to
timewait. I attempted to ensure this by only doing rcu_dereference for
tcp_authopt_info and tcp_authopt_key_info once per packet.