Re: 2.6.18 BUG: unable to handle kernel NULL pointer dereference at virtual address 0000000a
From: Andrew Morton
Date: Sat Sep 23 2006 - 16:43:41 EST
cc's added. This looks quite serious.
On Sat, 23 Sep 2006 17:56:05 +0200
Christian Weiske <cweiske@xxxxxxxxxx> wrote:
> Hello,
>
>
> I have a reproducible BUG on my server that occurs whenever disk usage
> gets too high / too much swapping occurs (at least I think that is the
> trigger). The box has one reiserfs filesystem of about 187GB; the disk
> is attached to an Epia 5000 board through a Promise Ultra 100 PCI IDE
> controller card.
>
Do you think this bug is due to the 2.6.18 upgrade?
Have you run fsck across the filesystem(s)?
Does the oops always look the same as this one?
Please turn on the various CONFIG_DEBUG_* options, see if that turns up
anything.
It would be interesting to find out if enabling CONFIG_4KSTACKS makes this
go away (although I'm not sure why).
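
On the stack-size question, the oops itself gives a hint: on i386 the ti= value is obtained by masking the stack pointer with the kernel stack size, so one can check which size is consistent with the printed esp. A rough sketch in Python, assuming thread_info sits at the base of the kernel stack; the function names are made up for illustration:

```python
# Sketch only: assumes the i386 convention that thread_info lives at the
# bottom of the kernel stack and is located by masking esp.

def thread_info_base(esp: int, thread_size: int) -> int:
    """Round esp down to the start of a thread_size-aligned stack."""
    return esp & ~(thread_size - 1)


def stack_size_matching(esp: int, ti: int, candidates=(4096, 8192)):
    """Return the candidate stack sizes for which esp masks down to ti."""
    return [size for size in candidates if thread_info_base(esp, size) == ti]


def bytes_above_base(esp: int, ti: int) -> int:
    """Distance from the stack base to esp (this span includes thread_info)."""
    return esp - ti


if __name__ == "__main__":
    esp, ti = 0xC7F43850, 0xC7F42000  # esp and ti from the first oops
    print(stack_size_matching(esp, ti))
    print(hex(bytes_above_base(esp, ti)))
```

For the first oops only 8192 matches, and the esp of the second oops (c7f436b8) masks to the same base, so this kernel appears to be running with 8K stacks, with about 0x1850 bytes between esp and the stack base when the first fault hit.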
This looks more like a bug in the CPU scheduler than in the filesystem.
p->thread_info is bad (task.ti=00000002 in the oops, i.e. nearly NULL) in
scheduler_tick()'s first call to set_tsk_need_resched(), at line 3008.
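
The reported fault address can be cross-checked against the Code: bytes. Both oopses fault at `<0f> ba 68 08 03` with eax: 00000002. A small sketch that decodes those five bytes and recomputes the address follows; the interpretation of the result (flags at offset 8 of thread_info, bit 3 being TIF_NEED_RESCHED) is my assumption about the 2.6.18 i386 layout, not something the report states, and the helper names are invented:

```python
# Sketch only: decodes one specific x86 form, `0f ba /5 ib` (bts with an
# 8-bit immediate) using a register base plus 8-bit displacement.

REG_NAMES = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def decode_bts_imm8(code: bytes):
    """Return (base_register, displacement, bit) or None if not that form."""
    if len(code) < 5 or code[0] != 0x0F or code[1] != 0xBA:
        return None
    modrm = code[2]
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    # mod == 1: disp8 follows; reg == 5 selects bts; rm == 4 would need a SIB
    if mod != 1 or reg != 5 or rm == 4:
        return None
    disp = code[3] - 256 if code[3] >= 0x80 else code[3]  # sign-extend disp8
    return rm, disp, code[4]


def fault_address(base: int, disp: int) -> int:
    """Effective address of the memory operand, wrapped to 32 bits."""
    return (base + disp) & 0xFFFFFFFF


if __name__ == "__main__":
    rm, disp, bit = decode_bts_imm8(bytes.fromhex("0fba680803"))
    eax = 0x00000002  # from the register dump in both oopses
    print(REG_NAMES[rm], disp, bit)
    print(hex(fault_address(eax, disp)))
```

The bytes decode to a bit-set of bit 3 at 8(%eax), and 0x2 + 8 gives 0xa, which agrees with the address in the BUG line. That fits a write to the flags word through a thread_info pointer of 0x2.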
Thanks.
>
> Any hints about how to resolve this problem are very welcome.
>
>
> The trace from the serial console:
> -------------
> Oops: 0002 [#1]
> PREEMPT
> Modules linked in:
> CPU: 0
> EIP: 0060:[<c0112a54>] Not tainted VLI
> EFLAGS: 00010013 (2.6.18 #1)
> EIP is at scheduler_tick+0x84/0x340
> eax: 00000002 ebx: c7eec590 ecx: c5e960d5 edx: 4457222b
> esi: c5e96100 edi: 0000002b ebp: c7f43864 esp: c7f43850
> ds: 007b es: 007b ss: 0068
> Process (pid: 6820, ti=c7f42000 task=c7eec590 task.ti=00000002)
> Stack: 00000000 c7eec590 c7eec590 00000000 00000000 c7f438d0 c0120c83 c7f438d0
>        00000000 c010597b 00000000 c04fbe00 c013d785 00000000 00000000 c7f438d0
>        c056ea00 00000000 c04fbe00 c7f438d0 c013d833 00000000 c7f438d0 c04fbe00
> Call Trace:
> [<c0120c83>] update_process_times+0x33/0x80
> [<c010597b>] timer_interrupt+0x3b/0x70
> [<c013d785>] handle_IRQ_event+0x35/0x70
> [<c013d833>] __do_IRQ+0x73/0x100
> [<c01047f5>] do_IRQ+0x25/0x50
> [<c0102e7a>] common_interrupt+0x1a/0x20
> [<c028300e>] _mmx_memcpy+0x6e/0x180
> [<c01b69f6>] leaf_copy_items+0x36/0x100
> [<c0282f1c>] memcpy+0x3c/0x50
> [<c0282f88>] memmove+0x38/0x50
> [<c01b72c5>] leaf_paste_in_buffer+0xa5/0x340
> [<c019fc4c>] balance_leaf+0x2cc/0x2e10
> [<c01af706>] get_parents+0x106/0x1a0
> [<c01a2ac1>] do_balance+0x61/0xf0
> [<c01b0d41>] wait_tb_buffers_until_unlocked+0x211/0x280
> [<c01b0f46>] fix_nodes+0x196/0x3d0
> [<c01bd3b6>] reiserfs_paste_into_item+0x196/0x1c0
> [<c01ab701>] reiserfs_allocate_blocks_for_region+0x971/0x13c0
> [<c01baea4>] search_for_position_by_key+0x134/0x330
> [<c013f6a6>] add_to_page_cache+0x46/0xc0
> [<c0162f92>] alloc_buffer_head+0x12/0x50
> [<c0160385>] alloc_page_buffers+0x65/0xc0
> [<c01a5606>] make_cpu_key+0x36/0x40
> [<c01b9b16>] pathrelse+0x26/0x40
> [<c01ad7a4>] reiserfs_file_write+0x694/0x720
> [<c01404f6>] __generic_file_aio_read+0x196/0x210
> [<c0140280>] file_read_actor+0x0/0xe0
> [<c012039c>] change_clocksource+0xc/0x140
> [<c0120b4d>] update_wall_time+0x18d/0x290
> [<c012b0c0>] autoremove_wake_function+0x0/0x40
> [<c0112c65>] scheduler_tick+0x295/0x340
> [<c015e254>] vfs_write+0x84/0x150
> [<c015e3cd>] sys_write+0x3d/0x70
> [<c0102c17>] syscall_call+0x7/0xb
> Code: da 8b 5d f0 01 4b 50 11 53 54 39 1d 04 5d 5a c0 89 35 f8 5c 5a c0 89 3d fc 5c 5a c0 74 12 a1 0c 5d 5a c0 39 43 30 74 1f 8b 43 04 <0f> ba 68 08 03 8d 65 f4 5b 5e 5f 5d c3 eb 0d 90 90 90 90 90 90
> EIP: [<c0112a54>] scheduler_tick+0x84/0x340 SS:ESP 0068:c7f43850
> <1>BUG: unable to handle kernel NULL pointer dereference at virtual address 0000000a
> printing eip:
> c01123b2
> *pde = 00000000
> Oops: 0002 [#2]
> PREEMPT
> Modules linked in:
> CPU: 0
> EIP: 0060:[<c01123b2>] Not tainted VLI
> EFLAGS: 00010097 (2.6.18 #1)
> EIP is at try_to_wake_up+0x52/0xb0
> eax: 00000002 ebx: cf79fa90 ecx: cf79fab8 edx: c7eec590
> esi: c05a5ce0 edi: 00000000 ebp: c7f436c8 esp: c7f436b8
> ds: 007b es: 007b ss: 0068
> Process (pid: 6820, ti=c7f42000 task=c7eec590 task.ti=00000002)
> Stack: 00000012 00000000 c04fbfcc 00000001 c7f436ec c0112d66 cf79fa90 00000001
>        00000000 00000000 c7f42000 00000000 00000012 c7f43714 c0112dc2 c04fbfcc
>        00000001 00000001 00000000 00000000 000031f8 00000046 000031f8 fffff5d8
> Call Trace:
> [<c0112d66>] __wake_up_common+0x36/0x70
> [<c0112dc2>] __wake_up+0x22/0x50
> [<c011786a>] release_console_sem+0xda/0x100
> [<c01175af>] vprintk+0x18f/0x2b0
> [<c01176b9>] vprintk+0x299/0x2b0
> [<c010323d>] show_stack_log_lvl+0x8d/0xb0
> [<c0112a68>] scheduler_tick+0x98/0x340
> [<c011740f>] printk+0xf/0x20
> [<c010ded3>] bust_spinlocks+0x43/0x50
> [<c0103575>] die+0x85/0x210
> [<c010e1c0>] do_page_fault+0x0/0x570
> [<c010e490>] do_page_fault+0x2d0/0x570
> [<c0112d66>] __wake_up_common+0x36/0x70
> [<c010e1c0>] do_page_fault+0x0/0x570
> [<c0102ec9>] error_code+0x39/0x40
> [<c0112a54>] scheduler_tick+0x84/0x340
> [<c0120c83>] update_process_times+0x33/0x80
> [<c010597b>] timer_interrupt+0x3b/0x70
> [<c013d785>] handle_IRQ_event+0x35/0x70
> [<c013d833>] __do_IRQ+0x73/0x100
> [<c01047f5>] do_IRQ+0x25/0x50
> [<c0102e7a>] common_interrupt+0x1a/0x20
> [<c028300e>] _mmx_memcpy+0x6e/0x180
> [<c01b69f6>] leaf_copy_items+0x36/0x100
> [<c0282f1c>] memcpy+0x3c/0x50
> [<c0282f88>] memmove+0x38/0x50
> [<c01b72c5>] leaf_paste_in_buffer+0xa5/0x340
> [<c019fc4c>] balance_leaf+0x2cc/0x2e10
> [<c01af706>] get_parents+0x106/0x1a0
> [<c01a2ac1>] do_balance+0x61/0xf0
> [<c01b0d41>] wait_tb_buffers_until_unlocked+0x211/0x280
> [<c01b0f46>] fix_nodes+0x196/0x3d0
> [<c01bd3b6>] reiserfs_paste_into_item+0x196/0x1c0
> [<c01ab701>] reiserfs_allocate_blocks_for_region+0x971/0x13c0
> [<c01baea4>] search_for_position_by_key+0x134/0x330
> [<c013f6a6>] add_to_page_cache+0x46/0xc0
> [<c0162f92>] alloc_buffer_head+0x12/0x50
> [<c0160385>] alloc_page_buffers+0x65/0xc0
> [<c01a5606>] make_cpu_key+0x36/0x40
> [<c01b9b16>] pathrelse+0x26/0x40
> [<c01ad7a4>] reiserfs_file_write+0x694/0x720
> [<c01404f6>] __generic_file_aio_read+0x196/0x210
> [<c0140280>] file_read_actor+0x0/0xe0
> [<c012039c>] change_clocksource+0xc/0x140
> [<c0120b4d>] update_wall_time+0x18d/0x290
> [<c012b0c0>] autoremove_wake_function+0x0/0x40
> [<c0112c65>] scheduler_tick+0x295/0x340
> [<c015e254>] vfs_write+0x84/0x150
> [<c015e3cd>] sys_write+0x3d/0x70
> [<c0102c17>] syscall_call+0x7/0xb
> Code: 3d 83 f8 02 74 63 a8 40 75 62 6a 01 56 53 e8 f6 fe ff ff 8b 45 10 83 c4 0c 85 c0 75 1c 8b 56 20 8b 42 1c 39 43 1c 7d 11 8b 42 04 <0f> ba 68 08 03 89 f6 8d bc 27 00 00 00 00 bf 01 00 00 00 c7 03
> EIP: [<c01123b2>] try_to_wake_up+0x52/0xb0 SS:ESP 0068:c7f436b8
> <0>Kernel panic - not syncing: Fatal exception in interrupt
> -------------
>
>
> # cat /proc/cpuinfo
> processor : 0
> vendor_id : CentaurHauls
> cpu family : 6
> model : 7
> model name : VIA Samuel 2
> stepping : 3
> cpu MHz : 533.373
> cache size : 64 KB
> fdiv_bug : no
> hlt_bug : no
> f00f_bug : no
> coma_bug : no
> fpu : yes
> fpu_exception : yes
> cpuid level : 1
> wp : yes
> flags : fpu de tsc msr cx8 mtrr pge mmx 3dnow
> bogomips : 1068.09
>
>
> # ./scripts/ver_linux
> If some fields are empty or look unusual you may have an old version.
> Compare to the current minimal requirements in Documentation/Changes.
>
> Linux dojo 2.6.18 #1 PREEMPT Sat Sep 23 16:24:51 Local time zone must be set--see i686 VIA Samuel 2 GNU/Linux
>
> Gnu C 3.4.6
> Gnu make 3.80
> binutils 2.16.1
> util-linux 2.12r
> mount 2.12r
> module-init-tools 3.2.1
> e2fsprogs 1.38
> reiserfsprogs 3.6.19
> Linux C Library 2.3.6
> Dynamic linker (ldd) 2.3.6
> Procps 3.2.6
> Net-tools 1.60
> Kbd 1.12
> Sh-utils 5.94
> udev 087
> Modules Loaded
>
>
>
>
> Please CC me as I am not subscribed.
>
> --
> Regards/MfG,
> Christian Weiske
>
>