Re: did you seen a process stuck in D state?

Tom Holroyd (tomh@taz.ccs.fau.edu)
Wed, 27 Jan 1999 12:57:16 +0900 (JST)


On Wed, 27 Jan 1999, Andrea Arcangeli wrote:

>After 20 sec of deadlock you should see an Oops on the screen. Please
>filter the oops throught ksymoops and give us the kernel trace of the
>deadlocked process.

Did some postmark's and dd if=/dev/zero of=/tmp/moo count=50000
as preparation to swap a bunch of stuff and mix the buffers up
well. Running vmstat 2 and top. Then make MAKE="make -j5" dep and voila:

Options used: -V (default)
-o /lib/modules/2.2.0/ (default)
-k /proc/ksyms (default)
-l /proc/modules (default)
-m /usr/src/linux/System.map (default)
-c 1 (default)

No modules in ksyms, skipping objects
Jan 27 12:40:12 bhalpha1 kernel: Unable to handle kernel paging request at virtual address 0000000000000000
Jan 27 12:40:12 bhalpha1 kernel: top(346): Oops 1
Jan 27 12:40:12 bhalpha1 kernel: pc = [__down+192/416] ra = [__down+180/416] ps = 0000
Jan 27 12:40:12 bhalpha1 kernel: r0 = 0000000000444370 r1 = 0000000000000001 r2 = 00000000000589bb
Jan 27 12:40:12 bhalpha1 kernel: r3 = fffffc0000077ee0 r4 = fffffc00055e8000 r5 = fffffc00004929f8
Jan 27 12:40:12 bhalpha1 kernel: r6 = fffffc0000444000 r7 = 0000000000000263 r8 = fffffc00055e8000
Jan 27 12:40:12 bhalpha1 kernel: r9 = 0000000000000000 r10= fffffc0003ba2b88 r11= fffffc00055e8000
Jan 27 12:40:12 bhalpha1 kernel: r12= 0000000000000002 r13= 0000000000002694 r14= 00000000000002f9
Jan 27 12:40:12 bhalpha1 kernel: r15= fffffc00055ebbf8
Jan 27 12:40:12 bhalpha1 kernel: r16= 0000000000000000 r17= fffffc0000077ee0 r18= fffffc0000349dac
Jan 27 12:40:12 bhalpha1 kernel: r19= 0000000000000000 r20= 0000000000000001 r21= fffffc0000000000
Jan 27 12:40:12 bhalpha1 kernel: r22= 0000000000000001 r23= 0000000000204080 r24= 7600000000000000
Jan 27 12:40:12 bhalpha1 kernel: r25= 0000000000000001 r27= fffffc0000310ac0 r28= 0000000000002852
Jan 27 12:40:12 bhalpha1 kernel: gp = fffffc000048f680 sp = fffffc00055ebbf8
Jan 27 12:40:12 bhalpha1 kernel: Code: a02a0000 402d05a1 e4200001 <b3e90000> b58b0000 a46a0010 47ff0404 a84a0004 ec400003

Code: fffffffffffffff4 <END_OF_CODE+3ffffb3ea90/????> 0000000000000000 <_EIP>:
Code: fffffffffffffff4 <END_OF_CODE+3ffffb3ea90/????> 0: a0 2a 00 00 call_pal 0x2aa0
Code: fffffffffffffff8 <END_OF_CODE+3ffffb3ea94/????> 4: 40 2d 05 a1 ldl t7,11584(t4)
Code: fffffffffffffffc <END_OF_CODE+3ffffb3ea98/????> 8: e4 20 00 01 call_pal 0x10020e4
Code: 0000000000000000 Before first symbol c: b3 e9 00 00 call_pal 0xe9b3
Code: 0000000000000004 Before first symbol 10: b5 8b 00 00 call_pal 0x8bb5
Code: 0000000000000008 Before first symbol 14: a4 6a 00 10 .long 0x10006aa4
Code: 000000000000000c Before first symbol 18: 47 ff 04 04 .long 0x404ff47
Code: 0000000000000010 Before first symbol 1c: a8 4a 00 04 .long 0x4004aa8
Code: 0000000000000014 Before first symbol 20: ec 40 00 03 call_pal 0x30040ec

Jan 27 12:40:12 bhalpha1 kernel: Trace:
[__down_failed+100/184]
[do_bottom_half+124/192]
[generate_oops+0/64]
[do_exit+952/992]
[sprintf+4560/10652]
[get_process_array+96/160]
[get_stat+56/928]
[d_alloc+64/448]
[proc_root_lookup+140/480]
[cached_lookup+28 /160]
[do_follow_link+184/224]
[lookup_dentry+524/704]
[get_process_array+104/160]
[array_read+268/608]
[sys_read+256/320]
[entSys+168/192]

2 warnings issued. Results may not be reliable.

This is an Alpha architecture. I have CONFIG_RTC on, too (don't know if
that's relevant but I can't ctrl-alt-del this box or it hangs in
callibrating bogomips -- need to halt and reset; fine without RTC).

The WCHAN for top is apparently bogus? It said "end" (at least on one
try -- didn't notice for this particular one).

No FS damage this time! ^_^;

Dr. Tom Holroyd
I would dance and be merry,
Life would be a ding-a-derry,
If I only had a brain.
-- The Scarecrow

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/