Re: 2.6.18-rc5-mm1

From: Andrew Morton
Date: Sun Sep 03 2006 - 20:48:06 EST


On Sun, 3 Sep 2006 17:22:17 -0700
"Pallipadi, Venkatesh" <venkatesh.pallipadi@xxxxxxxxx> wrote:

>
>
> >-----Original Message-----
> >From: Andrew Morton [mailto:akpm@xxxxxxxx]
> >Sent: Friday, September 01, 2006 6:30 PM
> >To: Matthias Hentges
> >Cc: linux-kernel@xxxxxxxxxxxxxxx; linux-acpi@xxxxxxxxxxxxxxx;
> >Pallipadi, Venkatesh
> >Subject: Re: 2.6.18-rc5-mm1
> >
> >On Sat, 02 Sep 2006 03:00:47 +0200
> >Matthias Hentges <oe@xxxxxxxxxxx> wrote:
> >
> >> 2.6.18-rc5-mm1 oopses on an Asus P5W DH Deluxe board, full dmesg
> >> attached.
> >> This did not happen in 2.6.18-rc4-mm3.
> >>
> >>
> >> BUG: unable to handle kernel NULL pointer dereference at
> >virtual address
> >> 00000000
> >> printing eip:
> >> 00000000
> >> *pde = 00000000
> >> Oops: 0000 [#1]
> >> 4K_STACKS SMP
> >> last sysfs file:
> >> Modules linked in:
> >> CPU: 0
> >> EIP: 0060:[<00000000>] Not tainted VLI
> >> EFLAGS: 00010087 (2.6.18-rc5-mm1 #1)
> >> EIP is at rest_init+0x3feffd78/0x20
> >> eax: 000000da ebx: c04d5f78 ecx: c04d5f94 edx: c04d2f00
> >> esi: 000000da edi: 00000000 ebp: c04d2f00 esp: c0516ffc
> >> ds: 007b es: 007b ss: 0068
> >> Process swapper (pid: 0, ti=c0516000 task=c045c200 task.ti=c04d5000)
> >> Stack: c0105027
> >> Call Trace:
> >> [<c0105027>] do_IRQ+0x8a/0xac
> >> [<c01035a6>] common_interrupt+0x1a/0x20
> >> [<c0101a72>] mwait_idle_with_hints+0x36/0x3b
> >> [<c0101a83>] mwait_idle+0xc/0x1b
> >> [<c0101a26>] cpu_idle+0x5e/0x74
> >> [<c04db6fa>] start_kernel+0x363/0x36a
> >> =======================
> >> Code: Bad EIP value.
> >> EIP: [<00000000>] rest_init+0x3feffd78/0x20 SS:ESP 0068:c0516ffc
> >> <0>Kernel panic - not syncing: Fatal exception in interrupt
> >> BUG: warning at arch/i386/kernel/smp.c:547/smp_call_function()
> >> [<c010ca45>] smp_call_function+0x54/0xff
> >> [<c011a270>] printk+0x12/0x16
> >> [<c010cb03>] smp_send_stop+0x13/0x1c
> >> [<c0119480>] panic+0x49/0xd3
> >> [<c010410c>] die+0x273/0x28a
> >> [<c01126d4>] do_page_fault+0x40d/0x4db
> >> [<c01122c7>] do_page_fault+0x0/0x4db
> >> [<c03d1231>] error_code+0x39/0x40
> >> [<c013007b>] free_module+0x89/0xc3
> >> [<c0105027>] do_IRQ+0x8a/0xac
> >> [<c01035a6>] common_interrupt+0x1a/0x20
> >> [<c0101a72>] mwait_idle_with_hints+0x36/0x3b
> >> [<c0101a83>] mwait_idle+0xc/0x1b
> >> [<c0101a26>] cpu_idle+0x5e/0x74
> >> [<c04db6fa>] start_kernel+0x363/0x36a
> >> =======================
> >
> >OK, thanks. That'll be acpi-mwait-c-state-fixes.patch. I've
> >uploaded the
> >below revert patch to
> >ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2
> >.6.18-rc5/2.6.18-rc5-mm1/hot-fixes/
> >
>
> Andrew,
>
> As this patch doesn't seem to be the issue here, can you un-revert the
> patch in mm...
>

Spose so.

But what _did_ cause it? Looks like we took an IRQ and then leapt into
outer space, when do_IRQ() called desc->handle_irq().

Matthias, could you please test with CONFIG_4KSTACKS=n?

Also, one cause of this might be a module which fails to clean up when it's
removed. And the trace indicates that some module has previously
been unloaded. Can you work out which module(s) that might be?


--
VGER BF report: H 0
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/