[patch 38/47] tcp: Fix inconsistency source (CA_Open only when !tcp_left_out(tp))

From: Greg KH
Date: Fri Jun 13 2008 - 20:25:01 EST


-stable review patch. If anyone has any objections, please let us know.

------------------
From: Ilpo Järvinen <ilpo.jarvinen@xxxxxxxxxxx>

[ upstream commit: 8aca6cb1179ed9bef9351028c8d8af852903eae2 ]

It is possible that this skip path causes TCP to end up in an
invalid state where ca_state was left at CA_Open while some
segments had already been counted in sacked_out. If the next valid
ACK doesn't contain new SACK information, TCP fails to enter
tcp_fastretrans_alert(). Thus at least high_seq is set
incorrectly to a too-high seqno, because some new data segments
could be sent in between (and limited transmit is not correctly
invoked there either). Reordering in both directions can easily
cause this situation to occur.
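
For illustration only (this is not part of the patch), the state
handling described above can be modeled in userspace C. The struct
and the model_* names are hypothetical stand-ins for the kernel's
tcp_sock fields and helpers; the with_fix flag is an assumption added
to contrast the old and new behavior on the old_ack path.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace model only; names mirror the patch but are hypothetical. */
enum model_ca_state { MODEL_CA_OPEN, MODEL_CA_DISORDER };

struct model_tp {
	uint32_t sacked_out;   /* segments reported as SACKed */
	uint32_t lost_out;     /* segments marked lost */
	uint32_t retrans_out;  /* retransmissions still outstanding */
	uint32_t undo_marker;  /* nonzero while an undo is still possible */
	uint32_t snd_nxt;      /* next sequence number to be sent */
	uint32_t high_seq;     /* snd_nxt recorded at the last state change */
	enum model_ca_state state;
};

/* Same sum as the kernel's tcp_left_out(): sacked_out + lost_out. */
static uint32_t model_left_out(const struct model_tp *tp)
{
	return tp->sacked_out + tp->lost_out;
}

/* Mirrors the logic the patch factors out into tcp_try_keep_open(). */
static void model_try_keep_open(struct model_tp *tp)
{
	enum model_ca_state state = MODEL_CA_OPEN;

	assert(tp != NULL);

	if (model_left_out(tp) || tp->retrans_out || tp->undo_marker)
		state = MODEL_CA_DISORDER;

	if (tp->state != state) {
		tp->state = state;
		tp->high_seq = tp->snd_nxt;
	}
}

/* An old ACK that still carries SACK blocks: sacked_out grows. With
 * with_fix == 0 the state is left untouched, as before the patch. */
static void model_old_ack(struct model_tp *tp, uint32_t newly_sacked,
			  int with_fix)
{
	tp->sacked_out += newly_sacked;
	if (with_fix && tp->state == MODEL_CA_OPEN)
		model_try_keep_open(tp);
}
```

Calling model_old_ack() with with_fix set to 0 leaves the model in
MODEL_CA_OPEN with a nonzero sacked_out, which is the inconsistent
combination described above. With with_fix set to 1 the model moves
to MODEL_CA_DISORDER and records high_seq before any new data is sent.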

I guess we would want to use tcp_moderate_cwnd(tp) there as well,
as it may be possible to use this to trigger an oversized burst to
the network by sending an old ACK with a huge amount of SACK info.
But I'm a bit unsure about its effects (mainly on FlightSize), so
to be on the safe side I just fixed it minimally for now to keep
TCP's state consistent (obviously, such nasty ACKs have been
possible so far). It seems, though, that FlightSize is already
underestimated by some amount, so in the long term we might want
to trigger recovery there too, if appropriate, to make the
FlightSize calculation resemble reality at the time when the
losses were discovered (but such a change scares me too much now,
and it requires more thinking about how to do it anyway, as it
likely involves some code shuffling).
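
As a rough sketch of the burst concern (again not part of the patch),
the following userspace model assumes that cwnd moderation clamps
snd_cwnd to the packets in flight plus a burst allowance, and that
packets in flight are packets_out minus left-out segments plus
retrans_out. The struct, the model_* names, and the max_burst
parameter are hypothetical; the kernel derives its burst allowance
internally.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace model only; field names follow tcp_sock for readability. */
struct cwnd_model {
	uint32_t packets_out;  /* segments sent and not yet cumulatively ACKed */
	uint32_t sacked_out;   /* segments reported as SACKed */
	uint32_t lost_out;     /* segments marked lost */
	uint32_t retrans_out;  /* retransmissions still outstanding */
	uint32_t snd_cwnd;     /* congestion window, in segments */
};

/* packets_out - (sacked_out + lost_out) + retrans_out */
static uint32_t model_in_flight(const struct cwnd_model *m)
{
	uint32_t left_out = m->sacked_out + m->lost_out;

	/* Guard against unsigned underflow in this simplified model. */
	assert(left_out <= m->packets_out);
	return m->packets_out - left_out + m->retrans_out;
}

/* Clamp cwnd so that at most max_burst new segments can go out at once. */
static void model_moderate_cwnd(struct cwnd_model *m, uint32_t max_burst)
{
	uint32_t cap = model_in_flight(m) + max_burst;

	if (m->snd_cwnd > cap)
		m->snd_cwnd = cap;
}
```

With packets_out = 20, snd_cwnd = 20 and an old ACK that SACKs 15
segments, the model has 5 segments in flight, so an unmoderated
window would allow 15 new segments back to back. Moderating with a
burst allowance of 3 reduces snd_cwnd to 8.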

This bug was found by Brian Vowell while running my TCP debug
patch to find the cause of another TCP issue (fackets_out
miscount).

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@xxxxxxxxxxx>
Signed-off-by: David S. Miller <davem@xxxxxxxxxxxxx>
Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxx>

---
net/ipv4/tcp_input.c | 29 +++++++++++++++++++----------
1 file changed, 19 insertions(+), 10 deletions(-)

--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -2469,6 +2469,20 @@ static inline void tcp_complete_cwr(stru
 	tcp_ca_event(sk, CA_EVENT_COMPLETE_CWR);
 }
 
+static void tcp_try_keep_open(struct sock *sk)
+{
+	struct tcp_sock *tp = tcp_sk(sk);
+	int state = TCP_CA_Open;
+
+	if (tcp_left_out(tp) || tp->retrans_out || tp->undo_marker)
+		state = TCP_CA_Disorder;
+
+	if (inet_csk(sk)->icsk_ca_state != state) {
+		tcp_set_ca_state(sk, state);
+		tp->high_seq = tp->snd_nxt;
+	}
+}
+
 static void tcp_try_to_open(struct sock *sk, int flag)
 {
 	struct tcp_sock *tp = tcp_sk(sk);
@@ -2482,15 +2496,7 @@ static void tcp_try_to_open(struct sock
 		tcp_enter_cwr(sk, 1);
 
 	if (inet_csk(sk)->icsk_ca_state != TCP_CA_CWR) {
-		int state = TCP_CA_Open;
-
-		if (tcp_left_out(tp) || tp->retrans_out || tp->undo_marker)
-			state = TCP_CA_Disorder;
-
-		if (inet_csk(sk)->icsk_ca_state != state) {
-			tcp_set_ca_state(sk, state);
-			tp->high_seq = tp->snd_nxt;
-		}
+		tcp_try_keep_open(sk);
 		tcp_moderate_cwnd(tp);
 	} else {
 		tcp_cwnd_down(sk, flag);
@@ -3296,8 +3302,11 @@ no_queue:
 	return 1;
 
 old_ack:
-	if (TCP_SKB_CB(skb)->sacked)
+	if (TCP_SKB_CB(skb)->sacked) {
 		tcp_sacktag_write_queue(sk, skb, prior_snd_una);
+		if (icsk->icsk_ca_state == TCP_CA_Open)
+			tcp_try_keep_open(sk);
+	}
 
 uninteresting_ack:
 	SOCK_DEBUG(sk, "Ack %u out of %u:%u\n", ack, tp->snd_una, tp->snd_nxt);
