Re: [PATCH v4 6/7] net/tcp: Store SNEs + SEQs on ao_info
From: Dmitry Safonov
Date: Wed Nov 29 2023 - 17:12:27 EST
On 11/29/23 21:01, Eric Dumazet wrote:
> On Wed, Nov 29, 2023 at 8:58 PM Dmitry Safonov <dima@xxxxxxxxxx> wrote:
>> On 11/29/23 18:34, Eric Dumazet wrote:
[..]
>>> You have not commented on where these are read without the socket lock held ?
>>
>> Sorry for missing this, the SNEs are used with this helper
>> tcp_ao_compute_sne(), so these places are (in square brackets AFAICS,
>> there is a chance that I miss something obvious from your message):
>>
>> - tcp_v4_send_reset() => tcp_ao_prepare_reset() [rcu_read_lock()]
>> - __tcp_transmit_skb() => tcp_ao_transmit_skb() [TX softirq]
>> - tcp_v4_rcv() => tcp_inbound_ao_hash() [RX softirq]
>
> All these should/must have the socket lock held !
>
> Or reading tcp_sk(sk)->rcv_nxt would be racy anyway (note the lack of
> READ_ONCE() on it)
To be fair, after this patch rcv_nxt is not read there anymore (the
SNEs are updated in parallel and kept on ao_info together with the
cached SEQs).
> I think you need more work to make sure this is done correctly.
Sure.
> ie tcp_inbound_hash() should be called from tcp_v4_do_rcv() after the
> bh_lock_sock_nested() and sock_owned_by_user() checks.
But then my concern would be that any incoming segment would cause
lock contention for the duration of the signature verification. That
could potentially create a DoS.
If this patch is too ugly to be acceptable, would taking
bh_lock_sock_nested() just around reading the SNEs + rcv_nxt/snd_una
sound better?
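Roughly, on the RX side, something along these lines (purely an
illustrative sketch; the ao_info field name below is a placeholder, not
necessarily what the patch adds):

	u32 sne, seq;

	/* Snapshot a consistent SNE+SEQ pair under the bh spinlock,
	 * without holding it for the whole hash verification.
	 */
	bh_lock_sock_nested(sk);
	sne = ao->rcv_sne;		/* placeholder field name */
	seq = tcp_sk(sk)->rcv_nxt;
	bh_unlock_sock(sk);

	/* ... and only then the pair is fed to tcp_ao_compute_sne()
	 * together with the SEQ from the skb, outside the lock.
	 */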
Let me add some information that is missing from the patch message but
may be critical to avoid misunderstanding:
Note that the code doesn't need precise SEQ numbers; it needs a
consistent SNE+SEQ pair to detect the moment the SEQ number rolls over,
so that tcp_ao_compute_sne() can use the decremented SNE for a
delayed/retransmitted segment and the incremented SNE for a new segment
post-rollover. So, technically, it just needs a correct SNE, which is
computed from the "cached" SEQ that goes with the "cached" SNE and the
SEQ taken from the skb.
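To make that concrete, the helper boils down to something like the
sketch below (simplified for illustration, not a verbatim copy of the
patch; the function name is made up, before() is the usual 32-bit
serial-number comparison from include/net/tcp.h):

	/*
	 * Given a consistent ("cached") SNE+SEQ pair, pick the SNE that
	 * corresponds to the SEQ taken from the skb.
	 */
	static u32 sne_for_seq(u32 cached_sne, u32 cached_seq, u32 skb_seq)
	{
		u32 sne = cached_sne;

		if (before(skb_seq, cached_seq)) {
			/*
			 * The segment is "behind" the cached point: if it's
			 * numerically bigger, the cached pair has already
			 * rolled over, so this is a delayed/retransmitted
			 * pre-rollover segment => previous SNE.
			 */
			if (skb_seq > cached_seq)
				sne--;
		} else {
			/*
			 * The segment is "ahead": if it's numerically
			 * smaller, it's past a rollover the cached pair
			 * hasn't seen yet => next SNE.
			 */
			if (skb_seq < cached_seq)
				sne++;
		}

		return sne;
	}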
As the TCP window size is smaller than 2 GB, a valid segment to be
verified or signed won't be far away from this consistent pair that
tcp_ao_compute_sne() uses.
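For example, with a cached pair (SNE=5, SEQ=0xffffff00): a new segment
at SEQ=0x00000100 is "after" in serial arithmetic but numerically
smaller, so it gets SNE=6; a retransmit at SEQ=0xfffffe00 stays at
SNE=5; and once the cached pair itself is past the rollover, say
(SNE=6, SEQ=0x00000100), a delayed segment at SEQ=0xffffff00 gets the
decremented SNE=5.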
Technically, if the "cached" SNE+SEQ pair is inconsistent (which is
unlikely, but may happen _prior_ to this patch), i.e. the SNE is from
pre-rollover while the SEQ is already post-rollover,
tcp_ao_compute_sne() will incorrectly increment/decrement the SNE used
for signing/verification of the TCP segment. As a result the segment
will fail verification and will be retransmitted again.
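(To illustrate with made-up numbers: if the cached SNE is still 5 from
before the rollover while the cached SEQ already reads 0x00000100, a
segment at SEQ=0x00000200 would be signed/verified with SNE=5 while the
peer uses 6, so the MAC doesn't match and the segment is dropped until
a later retransmission sees a consistent pair.)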
As it's an unlikely race that can only happen on a SEQ rollover (once
per 4 GB) and the TCP-AO connection doesn't break, but survives thanks
to the retransmission, I don't think it was noticed in testing.
Thanks,
Dmitry