2.6.29.4: softlockup at find_get_page() et al
From: Alexey Dobriyan
Date: Tue Jun 16 2009 - 07:01:10 EST
This happened during an overnight run while the box was slowly
cross-compiling a kernel (only -j7).
Example messages:
[67287.109985] BUG: soft lockup - CPU#0 stuck for 61s! [conf:3980]
[67287.110001] CPU 0:
[67287.110001] Pid: 3980, comm: conf Not tainted 2.6.29.4-x86_64 #1 P5E
[67287.110001] RIP: 0010:[<ffffffff80264902>] [<ffffffff80264902>] find_get_page+0x52/0xb0
[67287.110001] RSP: 0018:ffff8801001eddd8 EFLAGS: 00000246
[67287.110001] RAX: 0000000000000001 RBX: 0000000000000000 RCX: 0000000000000034
[67287.110001] RDX: 0000000000000002 RSI: 0000000000000001 RDI: ffffe200010a9f40
[67287.110001] RBP: ffffffff8020c3ee R08: ffffe200010a9f48 R09: ffff8800aa798c28
[67287.110001] R10: ffffe200010a9f40 R11: ffffffff802f9480 R12: ffff8800aa798c18
[67287.110001] R13: ffffffff8020c3ee R14: ffffffff802f4e70 R15: 000000000000000c
[67287.110001] FS: 0000000000000000(0000) GS:ffffffff8075b040(0063) knlGS:00000000557116c0
[67287.110001] CS: 0010 DS: 002b ES: 002b CR0: 000000008005003b
[67287.110001] CR2: 00000000556d9374 CR3: 000000012faa4000 CR4: 00000000000006e0
[67287.110001] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[67287.110001] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[67287.110001] Call Trace:
[67287.110001] [<ffffffff802648cb>] ? find_get_page+0x1b/0xb0
[67287.110001] [<ffffffff80264bf3>] ? find_lock_page+0x23/0x80
[67287.110001] [<ffffffff80265631>] ? find_or_create_page+0x41/0xc0
[67287.110001] [<ffffffff802f4edd>] ? ext2_make_empty+0x2d/0x1f0
[67287.110001] [<ffffffff802f9556>] ? ext2_mkdir+0xd6/0x170
[67287.110001] [<ffffffff8029b37c>] ? sys_mkdirat+0x11c/0x130
[67287.110001] [<ffffffff802a585a>] ? alloc_fd+0x4a/0x140
[67287.110001] [<ffffffff8022ab14>] ? sysenter_dispatch+0x7/0x2b
[67352.609983] BUG: soft lockup - CPU#0 stuck for 61s! [conf:3980]
[67352.610001] CPU 0:
[67352.610001] Pid: 3980, comm: conf Not tainted 2.6.29.4-x86_64 #1 P5E
[67352.610001] RIP: 0010:[<ffffffff8026dfbe>] [<ffffffff8026dfbe>] put_page+0x2e/0x170
[67352.610001] RSP: 0018:ffff8801001eddc8 EFLAGS: 00000202
[67352.610001] RAX: ffffe200010a9f48 RBX: ffff8800aa798c18 RCX: 0000000000000034
[67352.610001] RDX: 0000000000000000 RSI: ffffe200010a9f40 RDI: ffffe200010a9f40
[67352.610001] RBP: ffffffff8020c3ee R08: fa00000000000000 R09: 8000000000000000
[67352.610001] R10: ffffe200010a9f40 R11: ffffffff802f9480 R12: ffffffff802f4e70
[67352.610001] R13: 000000000000000c R14: ffffffff80299de9 R15: 0000000100000241
[67352.610001] FS: 0000000000000000(0000) GS:ffffffff8075b040(0063) knlGS:00000000557116c0
[67352.610001] CS: 0010 DS: 002b ES: 002b CR0: 000000008005003b
[67352.610001] CR2: 00000000556d9374 CR3: 000000012faa4000 CR4: 00000000000006e0
[67352.610001] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[67352.610001] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[67352.610001] Call Trace:
[67352.610001] [<ffffffff80264c42>] ? find_lock_page+0x72/0x80
[67352.610001] [<ffffffff80265631>] ? find_or_create_page+0x41/0xc0
[67352.610001] [<ffffffff802f4edd>] ? ext2_make_empty+0x2d/0x1f0
[67352.610001] [<ffffffff802f9556>] ? ext2_mkdir+0xd6/0x170
[67352.610001] [<ffffffff8029b37c>] ? sys_mkdirat+0x11c/0x130
[67352.610001] [<ffffffff802a585a>] ? alloc_fd+0x4a/0x140
[67352.610001] [<ffffffff8022ab14>] ? sysenter_dispatch+0x7/0x2b
[67418.109983] BUG: soft lockup - CPU#0 stuck for 61s! [conf:3980]
[67418.110001] CPU 0:
[67418.110001] Pid: 3980, comm: conf Not tainted 2.6.29.4-x86_64 #1 P5E
[67418.110001] RIP: 0010:[<ffffffff8026492e>] [<ffffffff8026492e>] find_get_page+0x7e/0xb0
[67418.110001] RSP: 0018:ffff8801001eddd8 EFLAGS: 00000246
[67418.110001] RAX: 0000000000000001 RBX: 0000000000000000 RCX: 0000000000000001
[67418.110001] RDX: 0000000000000002 RSI: 0000000000000001 RDI: ffffe200010a9f40
[67418.110001] RBP: ffffffff8020c3ee R08: ffffe200010a9f48 R09: ffff8800aa798c28
[67418.110001] R10: ffffe200010a9f40 R11: ffffffff802f9480 R12: ffff8800aa798c18
[67418.110001] R13: ffffffff8020c3ee R14: ffffffff802f4e70 R15: 000000000000000c
[67418.110001] FS: 0000000000000000(0000) GS:ffffffff8075b040(0063) knlGS:00000000557116c0
[67418.110001] CS: 0010 DS: 002b ES: 002b CR0: 000000008005003b
[67418.110001] CR2: 00000000556d9374 CR3: 000000012faa4000 CR4: 00000000000006e0
[67418.110001] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[67418.110001] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[67418.110001] Call Trace:
[67418.110001] [<ffffffff802648cb>] ? find_get_page+0x1b/0xb0
[67418.110001] [<ffffffff80264bf3>] ? find_lock_page+0x23/0x80
[67418.110001] [<ffffffff80265631>] ? find_or_create_page+0x41/0xc0
[67418.110001] [<ffffffff802f4edd>] ? ext2_make_empty+0x2d/0x1f0
[67418.110001] [<ffffffff802f9556>] ? ext2_mkdir+0xd6/0x170
[67418.110001] [<ffffffff8029b37c>] ? sys_mkdirat+0x11c/0x130
[67418.110001] [<ffffffff802a585a>] ? alloc_fd+0x4a/0x140
[67418.110001] [<ffffffff8022ab14>] ? sysenter_dispatch+0x7/0x2b
...
Then the box became unusable:
[90276.999985] BUG: soft lockup - CPU#0 stuck for 61s! [conf:3980]
[90277.000002] CPU 0:
[90277.000002] Pid: 3980, comm: conf Not tainted 2.6.29.4-x86_64 #1 P5E
[90277.000002] RIP: 0010:[<ffffffff80264902>] [<ffffffff80264902>] find_get_page+0x52/0xb0
[90277.000002] RSP: 0018:ffff8801001eddd8 EFLAGS: 00000246
[90277.000002] RAX: 0000000000000001 RBX: 0000000000000000 RCX: 0000000000000034
[90277.000002] RDX: 0000000000000002 RSI: 0000000000000001 RDI: ffffe200010a9f40
[90277.000002] RBP: ffffffff8020c3ee R08: ffffe200010a9f48 R09: ffff8800aa798c28
[90277.000002] R10: ffffe200010a9f40 R11: ffffffff802f9480 R12: ffff8800aa798c18
[90277.000002] R13: ffffffff8020c28e R14: ffffffff802f4e70 R15: 000000000000000c
[90277.000002] FS: 0000000000000000(0000) GS:ffffffff8075b040(0063) knlGS:00000000557116c0
[90277.000002] CS: 0010 DS: 002b ES: 002b CR0: 000000008005003b
[90277.000002] CR2: 00000000556d9374 CR3: 000000012faa4000 CR4: 00000000000006e0
[90277.000002] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[90277.000002] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[90277.000002] Call Trace:
[90277.000002] [<ffffffff802648cb>] ? find_get_page+0x1b/0xb0
[90277.000002] [<ffffffff80264bf3>] ? find_lock_page+0x23/0x80
[90277.000002] [<ffffffff80265631>] ? find_or_create_page+0x41/0xc0
[90277.000002] [<ffffffff802f4edd>] ? ext2_make_empty+0x2d/0x1f0
[90277.000002] [<ffffffff802f9556>] ? ext2_mkdir+0xd6/0x170
[90277.000002] [<ffffffff8029b37c>] ? sys_mkdirat+0x11c/0x130
[90277.000002] [<ffffffff802a585a>] ? alloc_fd+0x4a/0x140
[90277.000002] [<ffffffff8022ab14>] ? sysenter_dispatch+0x7/0x2b
[90279.364317] nf_conntrack: table full, dropping packet.
[90282.362418] nf_conntrack: table full, dropping packet.
...
FWIW, userspace is 32-bit, and ext2 hosts the source tree, build results,
and ccache.
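
For anyone sifting through a longer log of these reports, here is a minimal sketch that tallies which symbol+offset shows up as RIP in each dump. It is not part of the original report: the helper name summarize_rips and the SAMPLE constant are made up for illustration, and SAMPLE only repeats the four RIP lines from the traces above. The regex assumes the x86_64 "RIP: seg:[<addr>] [<addr>] sym+off/len" layout seen in this kernel's output.

```python
import re
from collections import Counter

# Matches the RIP line of an x86_64 soft-lockup dump in the layout shown above,
# e.g. "RIP: 0010:[<ffffffff80264902>] [<ffffffff80264902>] find_get_page+0x52/0xb0"
RIP_RE = re.compile(
    r"RIP: [0-9a-f]{4}:\[<[0-9a-f]+>\]\s+\[<[0-9a-f]+>\]\s+"
    r"([\w.]+)\+(0x[0-9a-f]+)/0x[0-9a-f]+"
)


def summarize_rips(log_text):
    """Count how often each symbol+offset appears as RIP in the log text."""
    counts = Counter()
    for line in log_text.splitlines():
        match = RIP_RE.search(line)
        if match:
            counts[f"{match.group(1)}+{match.group(2)}"] += 1
    return counts


# RIP lines copied from the four dumps quoted in this report (hypothetical
# constant name; in practice the text would come from dmesg or a log file).
SAMPLE = """\
[67287.110001] RIP: 0010:[<ffffffff80264902>] [<ffffffff80264902>] find_get_page+0x52/0xb0
[67352.610001] RIP: 0010:[<ffffffff8026dfbe>] [<ffffffff8026dfbe>] put_page+0x2e/0x170
[67418.110001] RIP: 0010:[<ffffffff8026492e>] [<ffffffff8026492e>] find_get_page+0x7e/0xb0
[90277.000002] RIP: 0010:[<ffffffff80264902>] [<ffffffff80264902>] find_get_page+0x52/0xb0
"""

if __name__ == "__main__":
    # Print the most frequent RIP locations first.
    for location, count in summarize_rips(SAMPLE).most_common():
        print(f"{count:3d}  {location}")
```

Feeding it a full saved log instead of SAMPLE would show whether the reported RIP keeps moving between find_get_page() and put_page() across all the dumps, or settles on one address.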