[FIXED] Re: Total machine lockup w/ current kernels while installing from CD

From: Bernhard Rosenkraenzer
Date: Mon May 15 2006 - 16:31:46 EST


On Thursday, 11. May 2006 03:22, Bernhard Rosenkraenzer wrote:
> Hi,
> I've built a CD that installs a customized system
> [... crash at a random point ...]
> BUG: soft lockup detected on CPU#0!
>
> Pid: 421, comm: kjournald
> EIP: 0060:[<b01a2f52>] CPU: 0
> EIP is at journal_commit_transaction+0x92e/0xfcc
> EFLAGS: 00000297 Not tainted (2.6.16-rc6 #1)
> EAX: 00000001 EBX: c2d34788 ECX: 00000001 EDX: c785e000
> ESI: b3ff8d04 EDI: 000000f0 EBP: b683b840 DS: 007b ES: 007b
> CR0: 8005003b CR2: 0841f7fc CR3: 17217000 CR4: 000006d0
> [<b02bd52e>] schedule+0x2ee/0x5b6
> [<b01a6a88>] kjournald+0x201/0x213
> [<b0111089>] smp_apic_timer_interrupt+0x32/0x49
> [<b01a6937>] kjournald+0xb0/0x213
> [<b01a5ffa>] commit_timeout+0x0/0x9
> [<b012a789>] autoremove_wake_function+0x0/0x4b
> [<b01a6887>] kjournald+0x0/0x213
> [<b0101005>] kernel_thread_helper+0x5/0xb

After backing out lots of changes, I've figured out the problem is caused by
this bit of 2.6.16-rc6:

diff -urN linux-2.6.16-rc5/kernel/sched.c linux-2.6.16-rc6/kernel/sched.c
--- linux-2.6.16-rc5/kernel/sched.c 2006-05-11 20:04:18.000000000 +0200
+++ linux-2.6.16-rc6/kernel/sched.c 2006-05-11 20:00:00.000000000 +0200
@@ -4028,6 +4021,8 @@
*/
if (unlikely(preempt_count()))
return;
+ if (unlikely(system_state != SYSTEM_RUNNING))
+ return;
do {
add_preempt_count(PREEMPT_ACTIVE);
schedule();


The problem is that (to save a couple of bits of space), my simple installer
was running inside an initrd -- and system_state isn't set to SYSTEM_RUNNING
before linuxrc is executed --> scheduler breakage causes the oops.

Since initrds are deprecated anyway, I've used this opportunity to update the
installer to use initramfs instead, and as expected, the problem went away --
nevertheless [unless initrds are being removed in 2.6.18 anyway] we should
probably fix this or at least document it somewhe.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/