Re: ASPLOS miss ratio porting to planet labs kernel

From: Sizhao Yang
Date: Fri Jul 15 2005 - 13:14:17 EST


Hi Marcelo,

Thank you for the quick response; I didn't expect one so soon.

> > I was wondering if someone could help me with this. I'm porting the
> > miss ratio curve code from an ASPLOS paper from 2.4.20 to 2.6.11.6,
> > and eventually to the Planet Labs kernel. It's a novel idea for
> > memory management. At run time, the port consistently hits kernel
> > bugs in four different places: bad_page, bad_range,
> > BUG(page_mapcount(page) < 0) in rmap.c, and a failure in apm_do_idle.
> > All of these functions except apm_do_idle seem to be new between
> > 2.4.20 and 2.6.11.6. I'm pretty sure I'm forgetting to account for
> > certain things when modifying the pages, but I'm not sure where.
>
> Having the information that bad_page etc. dump out would definitely help.
>
> I can't figure out what is going on from the data you provided; probably
> someone else can.

Here are the dumps I have for bad_page, bad_range, and rmap.c.

Error 1: when this one triggers, it traces back to bad_range and
bad_page.
------------[ cut here ]------------
kernel BUG at mm/page_alloc.c:647!
invalid operand: 0000 [#1]
Modules linked in:
CPU: 0
EIP: 0060:[<c0137af8>] Not tainted VLI
EFLAGS: 00000202 (2.6.11-kgdb)
EIP is at buffered_rmqueue+0x178/0x1d0
eax: 00000001 ebx: c033c164 ecx: fff99e0b edx: 00001000
esi: c033c164 edi: 00000246 ebp: c033c180 esp: c7db5df0
ds: 007b es: 007b ss: 0068
Process nbench (pid: 301, threadinfo=c7db4000 task=c1273020)
Stack: c033c190 c7e3b000 c033c178 000080d2 00000000 c033c164 00000000 00000000
000080d2 c013800d 00000001 00000000 00000000 c11657e0 00000001 00000000
c1273020 00000010 c033c3cc 00000000 c03e0aa0 003ba025 c7dbffd0 c12f385c
Call Trace:
[<c013800d>] __alloc_pages+0x3fd/0x460
[<c0142e19>] do_anonymous_page+0x59/0x120
[<c0142f3e>] do_no_page+0x5e/0x280
[<c014338c>] handle_mm_fault+0x12c/0x1b0
[<c0113a76>] do_page_fault+0x176/0x5db
[<c014479d>] vma_merge+0xbd/0x180
[<c0107cd1>] old_mmap+0xc1/0x100
[<c0113900>] do_page_fault+0x0/0x5db
[<c0102f8b>] error_code+0x2b/0x30
Code: 00 00 00 8b 44 24 08 83 c4 14 5b 5e 5f 5d c3 8b 54 24 10 8b 44
24 08 e8 c7 f6 ff ff eb e5 0f 0b 5e 02 ed f4 2f c0 e9 7c ff ff ff <0f>
0b 87 02 ed f4 2f c0 e9 09 ff ff ff 9c 5f fa 8b 54 24 10 89
<1>Unable to handle kernel paging request at virtual address 00100104
printing eip:
c0137927
*pde = 00000000
Oops: 0002 [#2]
Modules linked in:
CPU: 0
EIP: 0060:[<c0137927>] Not tainted VLI
EFLAGS: 00000097 (2.6.11-kgdb)


Error 2 (the main error), in rmap.c:

------------[ cut here ]------------
kernel BUG at mm/rmap.c:487!
invalid operand: 0000 [#1]
Modules linked in:
CPU: 0
EIP: 0060:[<c0147b5d>] Not tainted VLI
EFLAGS: 00000292 (2.6.11-kgdb)
EIP is at page_remove_rmap+0x3d/0x70
eax: ffffffe0 ebx: c10fc6a0 ecx: 00000000 edx: c7d9de74
esi: c7d78124 edi: 00010000 ebp: c10fc6a0 esp: c7d9de70
ds: 007b es: 007b ss: 0068
Process ls (pid: 302, threadinfo=c7d9c000 task=c1273020)
Stack: c02ffb8f 00001000 c0141442 c0300460 c7ddcfa4 00000000 c12f116c 07e35025
00000000 08048000 c03bda38 08448000 c7dde084 08058000 c03bda38 c01415ab
00010000 00000000 c7ddeb7c b83e9000 08048000 c7dde084 08058000 c03bda38
Call Trace:
[<c0141442>] zap_pte_range+0x162/0x280
[<c01415ab>] zap_pmd_range+0x4b/0x70
[<c014160d>] zap_pud_range+0x3d/0x60
[<c0141694>] unmap_page_range+0x64/0x80
[<c01417b9>] unmap_vmas+0x109/0x220
[<c0145e87>] exit_mmap+0x67/0x130
[<c0116fde>] mmput+0x1e/0x70
[<c011ab99>] do_exit+0x99/0x360
[<c011d2d3>] tasklet_action+0x43/0x70
[<c011d099>] __do_softirq+0x79/0x90
[<c011aecf>] do_group_exit+0x2f/0x70
[<c0102de3>] syscall_call+0x7/0xb
Code: 08 ff 0f 98 c0 84 c0 74 3a 8b 43 08 40 78 26 8b 43 08 40 78 16
8b 5c 24 04 ba ff ff ff ff b8 10 00 00 00 83 c4 08 e9 43 08 ff ff
<0f> 0b e7 01 7c fb 2f c0 eb e0 c7 04 24 8f fb 2f c0 e8 6d 14 fd
<6>note: ls[302] exited with preempt_count 1

------------[ cut here ]------------
kernel BUG at mm/rmap.c:487!
invalid operand: 0000 [#1]
Modules linked in:
CPU: 0
EIP: 0060:[<c0147b3d>] Not tainted VLI
EFLAGS: 00000296 (2.6.11-kgdb)
EIP is at page_remove_rmap+0x3d/0x70
eax: ffffffa0 ebx: c10fb680 ecx: 00000000 edx: c7dc9e88
esi: c7d37fd0 edi: 00021000 ebp: c10fb680 esp: c7dc9e84
ds: 007b es: 007b ss: 0068
Process nbench (pid: 301, threadinfo=c7dc8000 task=c1273020)
Stack: c02ffb6f 00016000 c0141436 00000246 00000100 00000000 c7dc9ef8 07db4067
00000000 b7fde000 c03bda38 b83de000 c7dcfb80 b7fff000 c03bda38 c014159b
00021000 00000000 c01205f0 00000100 b7fde000 c7dcfb80 b7fff000 c03bda38
Call Trace:
[<c0141436>] zap_pte_range+0x156/0x270
[<c014159b>] zap_pmd_range+0x4b/0x70
[<c01205f0>] cascade+0x30/0x50
[<c01415fd>] zap_pud_range+0x3d/0x60
[<c0106bb9>] timer_interrupt+0x49/0xe0
[<c0141684>] unmap_page_range+0x64/0x80
[<c01417a9>] unmap_vmas+0x109/0x220
[<c01457da>] unmap_region+0x6a/0xd0
[<c0145a91>] do_munmap+0xf1/0x130
[<c0145b10>] sys_munmap+0x40/0x70
[<c0102de3>] syscall_call+0x7/0xb
Code: 08 ff 0f 98 c0 84 c0 74 3a 8b 43 08 40 78 26 8b 43 08 40 78 16
8b 5c 24 04 ba ff ff ff ff b8 10 00 00 00 83 c4 08 e9 63 08 ff ff <0f>
0b e7 01 5c fb 2f c0 eb e0 c7 04 24 6f fb 2f c0 e8 8d 14 fd
<6>note: nbench[301] exited with preempt_count 1
scheduling while atomic: nbench/0x00000001/301
[<c02e9038>] __switch_to_end+0x20c/0x214
[<c01046eb>] do_IRQ+0x3b/0x70
[<c0102f52>] common_interrupt+0x1a/0x20
[<c02e9acd>] rwsem_down_read_failed+0x8d/0x170
[<c011bff0>] .text.lock.exit+0x27/0x87
[<c011ab99>] do_exit+0x99/0x360
[<c0103678>] die+0x138/0x140
[<c01039d0>] do_invalid_op+0x0/0xc0
[<c0103a6f>] do_invalid_op+0x9f/0xc0
[<c0120ad6>] update_process_times+0x166/0x210
[<c0147b3d>] page_remove_rmap+0x3d/0x70
[<c0132a79>] handle_IRQ_event+0x29/0x60
[<c011d099>] __do_softirq+0x79/0x90
[<c01046eb>] do_IRQ+0x3b/0x70
[<c0102f52>] common_interrupt+0x1a/0x20
[<c0102f8b>] error_code+0x2b/0x30
[<c0147b3d>] page_remove_rmap+0x3d/0x70
[<c0141436>] zap_pte_range+0x156/0x270
[<c014159b>] zap_pmd_range+0x4b/0x70
[<c01205f0>] cascade+0x30/0x50
[<c01415fd>] zap_pud_range+0x3d/0x60
[<c0106bb9>] timer_interrupt+0x49/0xe0
[<c0141684>] unmap_page_range+0x64/0x80
[<c01417a9>] unmap_vmas+0x109/0x220
[<c01457da>] unmap_region+0x6a/0xd0
[<c0145a91>] do_munmap+0xf1/0x130
[<c0145b10>] sys_munmap+0x40/0x70
[<c0102de3>] syscall_call+0x7/0xb


> Why don't you post the code (in case it's GPL)...

The patch is here.

https://wiki.planet-lab.org/twiki/pub/Planetlab/UIUCProject/mrc2.6.11.6patch.diff.txt.txt

It's for the 2.6.11.6 kernel. To reproduce the errors fully, comment out
the mrc_reset_pgtables(p); call on line 433 of mm/mrc.c before
compiling. If mrc_scan is commented out as well, there are no run-time
crashes at all. So all of the errors come from mrc_scan and
mrc_reset_pgtables; I just can't figure out how they are corrupting the
pages, or how I should account for that in the code.
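
If it helps, the edit described above can be scripted. This is only a
sketch: the helper name comment_out_reset is made up for illustration,
and line 433 is the number from my tree, so please check it against
yours after applying the patch. The helper only changes the line if it
really contains the call.

```shell
# Sketch only: comment out the mrc_reset_pgtables(p); call in a source file.
# comment_out_reset FILE [LINE]: helper name and default line are illustrative.
comment_out_reset() {
    file="$1"
    line="${2:-433}"    # line number from the message; verify it in your tree
    tmp="$file.tmp.$$"
    # Prefix the line with "// " only when it contains the call, so a
    # shifted line number leaves the file contents unchanged.
    sed "${line}s|^\(.*mrc_reset_pgtables(p);.*\)\$|// \1|" "$file" > "$tmp" &&
        mv "$tmp" "$file"
}

# Example, run from the top of the patched kernel tree:
#   comment_out_reset mm/mrc.c
```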

Also, make xconfig should bring up a menu. Go to "Kernel hacking" and
uncheck "Sleep-inside-spinlock checking". I usually use the qemu
emulator when debugging. Finally, to run mrc, start a program, get its
process id, and then do the following:

    echo "<process id> 1 1 100 100" > /proc/sys/vm/mrc_info

This will turn mrc on for the given process.
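
The step above can be wrapped in a couple of shell functions. Again this
is just a sketch: the function names and the MRC_INFO override are made
up for illustration, and the four values after the process id are copied
verbatim from the command above.

```shell
# Sketch only: start a program and switch MRC on for it.
# MRC_INFO defaults to the /proc path from this message; the override
# exists only so the helpers can be tried against an ordinary file.
MRC_INFO="${MRC_INFO:-/proc/sys/vm/mrc_info}"

# mrc_enable PID: write "<pid> 1 1 100 100" to the control file.
mrc_enable() {
    printf '%s 1 1 100 100\n' "$1" > "$MRC_INFO"
}

# mrc_run CMD [ARGS...]: run CMD in the background, enable MRC for its
# process id, then wait for it to finish.
mrc_run() {
    "$@" &
    pid=$!
    mrc_enable "$pid"
    wait "$pid"
}

# Example (nbench is the benchmark from the dumps above):
#   mrc_run ./nbench
```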

All MRC code is surrounded by //MRC or something like it for easy
recognition. If you use qemu, you may want to download these:

http://fabrice.bellard.free.fr/qemu/linux-test-0.5.1.tar.gz
http://fabrice.bellard.free.fr/qemu/qemu-0.7.0.tar.gz

and when running qemu.sh in linux-test, change the script to:

    qemu -nographic -hda linux.img \
        -kernel /boot/vmlinuz-2.6.11-kgdb \
        -append "console=ttyS0 root=/dev/hda sb=0x220,5,1,5 ide2=noprobe ide3=noprobe ide4=noprobe ide5=noprobe"

I'd appreciate any direction you can provide. Thanks for your help in
advance. I look forward to hearing from you.

Zao