Re: No one wants to help me :-(

From: Brian Gerst (bgerst@didntduck.org)
Date: Fri Apr 13 2001 - 13:08:00 EST


Mircea Damian wrote:
>
> Hello,
>
> I was expecting to receive some replies to my last desperate messages:
> linux-kernel@vger.kernel.org/msg35446.html">http://www.mail-archive.com/linux-kernel@vger.kernel.org/msg35446.html
> linux-kernel@vger.kernel.org/msg36591.html">http://www.mail-archive.com/linux-kernel@vger.kernel.org/msg36591.html
>
> My machine is dyeing in add_timer(). It seems to happen only on SMP
> machines and is something related to the network driver. For some reason
> one of the timer lists gets broken so we (we are two people trying to
> solve this issue) wrote a "safe" timer.c which tries to rebuild the chain
> in case it hits a NULL pointer.
>
> The machine is (ofcourse) slower with this patch but at least it works.
>
> Maybe someone can see which is the real bug and fix it.
>
> Please help!

I found (at least part of) the problem. In detach_timer() we test if
the timer is pending. If it is not the function does not remove the
timer from the list and returns 0. The functions that call
detach_timer() do not check the return value and unconditionally set the
list pointers to NULL, even though the timer is still on the list.
Patch against 2.4.3 attached, but there may be a better solution.

-- 

Brian Gerst

diff -urN linux-2.4.3/kernel/timer.c linux/kernel/timer.c --- linux-2.4.3/kernel/timer.c Thu Dec 14 20:52:22 2000 +++ linux/kernel/timer.c Fri Apr 13 13:26:08 2001 @@ -194,6 +194,7 @@ if (!timer_pending(timer)) return 0; list_del(&timer->list); + timer->list.next = timer->list.prev = NULL; return 1; } @@ -217,7 +218,6 @@ spin_lock_irqsave(&timerlist_lock, flags); ret = detach_timer(timer); - timer->list.next = timer->list.prev = NULL; spin_unlock_irqrestore(&timerlist_lock, flags); return ret; } @@ -246,7 +246,6 @@ spin_lock_irqsave(&timerlist_lock, flags); ret += detach_timer(timer); - timer->list.next = timer->list.prev = 0; running = timer_is_running(timer); spin_unlock_irqrestore(&timerlist_lock, flags); @@ -309,7 +308,6 @@ data= timer->data; detach_timer(timer); - timer->list.next = timer->list.prev = NULL; timer_enter(timer); spin_unlock_irq(&timerlist_lock); fn(data);

- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Sun Apr 15 2001 - 21:00:21 EST