Anyway, this would indeed have gotten worse in 1.3.73, because that started
using more of the delayed ack logic (the code isn't new per se, it's just used
in a new way). That could have made the problem much worse.
One way to test out this theory would be to check out "tcp_queue()" in
tcp_input.c, and find the place where it does something like this:
	if (!sk->delay_acks || th->fin) {
		tcp_send_ack(...
	}
	else
	{
		...
Now, what happens if you change the test to use the "tcp_send_ack()" path a lot
more, and not use the "else" path as much? (The else path does the delayed ack
thing, and if it is messed up, the retransmission code..)
I'd suggest changing the "if" case to something like this:
	if (!sk->delay_acks || th->fin ||
	    sk->bytes_rcv > sk->max_unacked ||
	    sk->ato > HZ/2 ||
	    tcp_raise_window(sk)) {
		tcp_send_ack(...
	}
	else
	{
(That is essentially what 1.3.72 uses, and while I don't particularly like the
extra cases it might help to be overly cautious in this case..)
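To make the difference between the two tests concrete, here is a rough standalone sketch of just the decision, outside the kernel. It is not the code in tcp_input.c: "struct toy_sock", the two function names, the field types and the HZ value of 100 are all assumptions made up for illustration, and tcp_raise_window(sk) is replaced by a plain flag passed in by the caller.

```c
#include <stdbool.h>

/* Assumed tick rate for this sketch only; the real HZ depends on the build. */
#define HZ 100

/* Toy stand-in for the socket fields the test reads (hypothetical, not struct sock). */
struct toy_sock {
	bool delay_acks;            /* delayed acks enabled for this socket */
	unsigned long bytes_rcv;    /* bytes received so far without an ack */
	unsigned long max_unacked;  /* limit on unacked bytes */
	unsigned long ato;          /* ack timeout, in ticks */
};

/* The existing test: ack at once only if delayed acks are off or FIN is set. */
static bool ack_now_original(const struct toy_sock *sk, bool fin)
{
	return !sk->delay_acks || fin;
}

/*
 * The suggested test: the same, plus three extra reasons to ack at once.
 * raise_window stands in for the result of tcp_raise_window(sk).
 */
static bool ack_now_cautious(const struct toy_sock *sk, bool fin, bool raise_window)
{
	return !sk->delay_acks || fin ||
	       sk->bytes_rcv > sk->max_unacked ||
	       sk->ato > HZ / 2 ||
	       raise_window;
}
```

With delayed acks on and no FIN, the original test always takes the "else" path, while the cautious one still acks immediately once any of the three extra conditions holds.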
If you do test this, please remove any other patches to your system (i.e. try
the above on a clean 1.3.84 or 1.3.85 kernel), so that I don't need to worry
about lots of different patches..
Linus