Re: ftp server crashes on heavy load: possible scheduler bug

From: Andrew Morton
Date: Fri Apr 29 2005 - 07:10:56 EST


"Pedro Venda (SYSADM)" <pjvenda@xxxxxxxxxxxxxx> wrote:
>
> We've made some changes on our ftp server, and since then it's been crashing
> frequently (every day) with a kernel panic.
>
> We've configured the 5 IDE 160GB drives into md raid5 arrays with LVM on top
> of that. All filesystems are reiserfs. The other change we made to the server
> was changing from a patched 2.6.10-ac12 kernel into a newer 2.6.11.7.
>
> Not being able to see the whole stack trace on screen, we set up
> netconsole to investigate. We started the server and loaded it heavily with
> rsyncs and such... until it crashed after just 20 minutes.
>
> The netconsole log was surprising - "kernel BUG at kernel/sched.c:2634!"

Strange. It'd be interesting to try disabling CONFIG_4KSTACKS. Also,
please add this to get a bit more info.

diff -puN kernel/sched.c~a kernel/sched.c
--- 25/kernel/sched.c~a 2005-04-29 05:05:24.792004408 -0700
+++ 25-akpm/kernel/sched.c 2005-04-29 05:06:36.015176840 -0700
@@ -2631,7 +2631,12 @@ void fastcall add_preempt_count(int val)
 	/*
 	 * Underflow?
 	 */
-	BUG_ON(((int)preempt_count() < 0));
+	if ((int)preempt_count() < 0) {
+		printk("preempt_count=%d\n", preempt_count());
+		BUG();
+	}
+	if ((int)preempt_count() > 1000)
+		printk("preempt_count=%d\n", preempt_count());
 	preempt_count() += val;
 	/*
 	 * Spinlock count overflowing soon?
_
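
For anyone who wants to see what the extra checks report without rebuilding a
kernel, here is a rough user-space sketch of the same logic. It is not kernel
code: sim_preempt_count, sim_add_preempt_count(), sim_sub_preempt_count() and
the check_result enum are hypothetical names made up for illustration, and the
function returns a status where the patched kernel would call BUG(). The
assert on val is an assumption of the sketch, not something the patch checks.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical user-space stand-in for the per-thread preempt counter. */
static int sim_preempt_count;

enum check_result { CHECK_OK, CHECK_SUSPICIOUS, CHECK_UNDERFLOW };

/*
 * Same order of checks as the patch: look at the current counter first,
 * then apply the increment. An underflow is reported and the increment is
 * skipped, which is where the real kernel would BUG().
 */
static enum check_result sim_add_preempt_count(int val)
{
	enum check_result res = CHECK_OK;

	/* Sketch assumption: callers only ever add a positive amount. */
	assert(val > 0);

	if (sim_preempt_count < 0) {
		fprintf(stderr, "preempt_count=%d\n", sim_preempt_count);
		return CHECK_UNDERFLOW;
	}
	if (sim_preempt_count > 1000) {
		/* Not fatal, but far higher than normal nesting would give. */
		fprintf(stderr, "preempt_count=%d\n", sim_preempt_count);
		res = CHECK_SUSPICIOUS;
	}
	sim_preempt_count += val;
	return res;
}

/* Unchecked decrement, so an unbalanced caller can drive the count negative. */
static void sim_sub_preempt_count(int val)
{
	sim_preempt_count -= val;
}
```

Calling sim_sub_preempt_count() once more than sim_add_preempt_count() leaves
the counter at -1, and the next add reports it. That is the kind of imbalance
the printk in the patch is meant to expose.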
