Re: [2.2.13aa6 (bugfix release II) ]

From: David S. Miller (davem@redhat.com)
Date: Mon Jan 10 2000 - 14:43:40 EST


   Date: Mon, 10 Jan 2000 16:11:28 +0100 (CET)
   From: Andrea Arcangeli <andrea@suse.de>

   tcp_delack_timer doesn't really guarantee that, see:

Ok, thanks for catching this.

   If the quickack bit is set while calling tcp_send_delayed_ack the kernel
   will lockup immediatly in a way that matches the reports from ursus.

How? Always in such a case, timeout > max_timeout because this bit is
set and the values are unsigned.

   The reason for the deadlock is that the expired field of the timer will be
   set in the past and so the timer will reinsert inself in the first heap
   slot and so it will continue to reinsert and rexecute it in an infinite
   loop -> soft deadlock.

Not if the timeout>max_timeout test passes, which I think it will.

   I'd like to also make the timer code robust against
   these kind of subsystem bugs later but actually I am only focused to fix
   the offending code in TCP.

Agreed, so I want your fix to go in anyways.

But I do want to discuss where the error comes from and
why the timeout>max_timeout test does not prevent it.

   And really tp->delayed_acks is meaningless as far I can tell and the right
   thing to do is to remove the delayed_acks field completly (this must be
   done at least in 2.3.x). Removing it will avoid wasting time in the TCP
   code and will decrease half of the the delacks code braindamage :).

It's already done in the patch sets I've been feeding Linus for 2.3.x
It has died in our sources, it will be no more :-)

Later,
David S. Miller
davem@redhat.com

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Sat Jan 15 2000 - 21:00:16 EST