Re: timer_bh robusteness fix against potential deadlocks

From: Andrea Arcangeli (andrea@suse.de)
Date: Tue Jan 11 2000 - 20:49:56 EST


On Wed, 12 Jan 2000, Andrea Arcangeli wrote:

>I fixed the timer code to be robust against the bad scenario I discovered
>in the last days. The bad secnario consists in a timer that reinsert
>itself with an expire <= jiffies (or more precisely < timer_jiffies).

As TyTso suggested the problem could be ato to be zero. I checked and I
finally also spotted the offending bug that was causing the above
condition to happen (second chunk of the patch):

--- 2.2.14-tcp/net/ipv4/tcp_output.c.~1~ Fri Jan 7 18:19:25 2000
+++ 2.2.14-tcp/net/ipv4/tcp_output.c Wed Jan 12 02:47:32 2000
@@ -1004,7 +1004,7 @@
         unsigned long timeout;
 
         /* Stay within the limit we were given */
- timeout = tp->ato;
+ timeout = (tp->ato << 1) >> 1;
         if (timeout > max_timeout)
                 timeout = max_timeout;
         timeout += jiffies;
@@ -1044,6 +1044,8 @@
                          */
                         if(tcp_in_quickack_mode(tp))
                                 tcp_exit_quickack_mode(tp);
+ if (!tp->ato)
+ tp->ato = tp->rto;
                         tcp_send_delayed_ack(tp, HZ/2);
                         return;
                 }

An incoming synack doesn't carry any data into the packet so the
tcp_delack_estimator gets not recalled from tcp_ack, and the ato stays
zero. Then tcp_send_ack (the one we send to put the other end in
enstablished state) goes oom and queue the delack timer while ato is still
zero. Then the timer gets reinserted in the current queue from
run_timer_list and boom!

The fact an oom condition was necessary to trigger the bug, perfectly
explains why it wasn't reproducible in most machines.

        ftp://ftp.*.kernel.org/pub/linux/kernel/people/andrea/patches/v2.2/2.2.14/delack-timer-2.gz

Andrea

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Sat Jan 15 2000 - 21:00:19 EST