Re: 2.0.36 oopses: clear_inode->grow_files->file_table_init->

Paul Wouters (paul@xtdnet.nl)
Mon, 22 Feb 1999 13:59:24 +0100 (MET)


On Fri, 12 Feb 1999, Paul Wouters wrote:

To follow up on my own post: the oopses are solved. They were caused by a
broken CPU fan and an overheated Pentium chip :(

With a new fan installed, the oopses have disappeared.

Paul

> Date: Fri, 12 Feb 1999 23:42:14 +0100 (MET)
> From: Paul Wouters <paul@xtdnet.nl>
> To: linux-kernel@vger.rutgers.edu
> Subject: 2.0.36 oopses: clear_inode->grow_files->file_table_init->die_if_kernel
>
> On one machine we're getting more and more oopses. I now suspect failing
> hardware. Can anyone comment on it? It's a standard machine: stock
> 2.0.36 (the oopses started while running 2.0.30, we then upgraded to 2.0.36),
> 3 network cards, an IDE disk, and a Pentium CPU.
>
> None of the oopses were fatal (so far!) but some processes died (cron, sshd,
> sendmail). Luckily this machine also had a restricted telnetd running :)
>
> Can anyone confirm (or reject) that this is caused by a dying hard disk?
> I'm very happy that, despite this, the machine is still running. Good job
> there, people :)
>
> Paul
>
> general protection: 0000
> CPU: 0
> EIP: 0010:[<00123792>]
> EFLAGS: 00010206
> eax: fffffc00 ebx: 015e3894 ecx: 00326c0c edx: 0000000a
> esi: 015e3810 edi: 00000000 ebp: 00000000 esp: 00457f94
> ds: 0018 es: 0018 fs: 002b gs: 002b ss: 0018
> Process sendmail (pid: 6195, process nr: 38, stackpage=00457000)
> Stack: bfffe7cc 00000000 000001a4 001237e8 00326c0c 00000000 bfffe944 bfffe70c
> 01c52d00 0010a915 bfffe7cc 00000000 000001a4 00000000 bfffe944 bfffe70c
> ffffffda 0000002b 0000002b 0000002b 0000002b 00000005 4007466c 00000023
> Call Trace: [<001237e8>] [<0010a915>]
> Code: 39 91 d8 01 00 00 7f 0a b8 e8 ff ff ff 5b 5e 5f c3 90 0f ab
>
>
> /usr/local/sbin/ksymoops System.map < crash3
> Using `System.map' to map addresses to symbols.
>
> >>EIP: 123792 <clear_inode+22/110>
> Trace: 1237e8 <clear_inode+78/110>
> Trace: 10a915 <die_if_kernel+105/2e0>
> Code: 123792 <clear_inode+22/110>
> Code: 123792 <clear_inode+22/110> 39 91 d8 01 00 cmpl %edx,0x1d8(%ecx)
> Code: 123798 <clear_inode+28/110> 7f 0a jg 1237a4 <clear_inode+34/110>
> Code: 12379a <clear_inode+2a/110> b8 e8 ff ff ff movl $0xffffffe8,%eax
> Code: 1237a5 <clear_inode+35/110> 5b popl %ebx
> Code: 1237a6 <clear_inode+36/110> 5e popl %esi
> Code: 1237a7 <clear_inode+37/110> 5f popl %edi
> Code: 1237a8 <clear_inode+38/110> c3 ret
> Code: 1237a9 <clear_inode+39/110> 90 nop
> Code: 1237aa <clear_inode+3a/110> 0f ab 00 btsl %eax,(%eax)
> Code: 1237b3 <clear_inode+43/110> 90 nop
> Code: 1237b4 <clear_inode+44/110> 90 nop
> Code: 1237b5 <clear_inode+45/110> 90 nop
>
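For readers unfamiliar with the ksymoops step above: it resolves each raw address to the nearest symbol at or below it in System.map and prints the offset from that symbol. The following is only a minimal Python sketch of that lookup, not ksymoops itself. The symbol table is a hypothetical excerpt: the start addresses are back-computed from the offsets in this report (for example 0x123792 - 0x22 = 0x123770 for clear_inode), the type letter "T" is assumed, and symbol sizes (the "/110" part) are not modelled.

```python
import bisect

# Hypothetical excerpt in System.map format: "<hex address> <type> <symbol>".
# Start addresses are back-computed from the offsets ksymoops printed in this
# report (e.g. 0x123792 - 0x22 = 0x123770); the type letter "T" is assumed.
SYSTEM_MAP = """\
0010a810 T die_if_kernel
00123660 T insert_inode_hash
00123770 T clear_inode
00124910 T grow_files
001249a0 T file_table_init
"""


def load_symbols(text):
    """Parse System.map-style text into a list of (address, name), sorted by address."""
    symbols = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 3:
            symbols.append((int(parts[0], 16), parts[2]))
    symbols.sort()
    return symbols


def resolve(symbols, address):
    """Return 'name+offset' (offset in hex) for the nearest symbol at or below address."""
    addresses = [start for start, _ in symbols]
    index = bisect.bisect_right(addresses, address) - 1
    if index < 0:
        return None  # address lies below the first known symbol
    start, name = symbols[index]
    return f"{name}+{address - start:x}"


symbols = load_symbols(SYSTEM_MAP)
for addr in (0x123792, 0x1237E8, 0x10A915, 0x12499C, 0x1249A1):
    print(f"{addr:x} <{resolve(symbols, addr)}>")
```

With a real System.map the same lookup applies; only the table is larger.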
> This one happened about an hour later:
>
> general protection: 0000
> CPU: 0
> EIP: 0010:[<00123792>]
> EFLAGS: 00010206
> eax: fffffc00 ebx: 015e3894 ecx: 008fcc0c edx: 0000000a
> esi: 015e3810 edi: 00000000 ebp: 00000000 esp: 004b2f94
> ds: 0018 es: 0018 fs: 002b gs: 002b ss: 0018
> Process sendmail (pid: 6253, process nr: 38, stackpage=004b2000)
> Stack: bfffe7cc 00000000 000001a4 001237e8 008fcc0c 00000000 bfffe944 bfffe70c
> 01c52d00 0010a915 bfffe7cc 00000000 000001a4 00000000 bfffe944 bfffe70c
> ffffffda 0000002b 0000002b 0000002b 0000002b 00000005 4007466c 00000023
> Call Trace: [<001237e8>] [<0010a915>]
> Code: 39 91 d8 01 00 00 7f 0a b8 e8 ff ff ff 5b 5e 5f c3 90 0f ab
>
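The addresses worth resolving in a raw oops like the one above are the EIP and the Call Trace entries, both written as `[<hex>]`. As a rough sketch only, assuming the oops text has this exact 2.0.x layout, they could be extracted in Python as follows; the function name `extract_addresses` is made up for illustration.

```python
import re

# Two lines copied from the oops above, still carrying their quote markers.
OOPS = """\
> EIP: 0010:[<00123792>]
> Call Trace: [<001237e8>] [<0010a915>]
"""

# Addresses appear as eight hex digits inside "[<" and ">]".
ADDRESS = re.compile(r"\[<([0-9a-fA-F]{8})>\]")


def extract_addresses(oops_text):
    """Return (eip, call_trace) as integers parsed from oops text."""
    eip = None
    trace = []
    for line in oops_text.splitlines():
        line = line.lstrip("> ").strip()  # drop email quote markers, if any
        found = [int(digits, 16) for digits in ADDRESS.findall(line)]
        if line.startswith("EIP:") and found:
            eip = found[0]
        elif line.startswith("Call Trace:"):
            trace.extend(found)
    return eip, trace


eip, trace = extract_addresses(OOPS)
print(hex(eip), [hex(address) for address in trace])
```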
> Giving:
>
> >>EIP: 123792 <clear_inode+22/110>
> Trace: 1237e8 <clear_inode+78/110>
> Trace: 10a915 <die_if_kernel+105/2e0>
> Code: 123792 <clear_inode+22/110>
> Code: 123792 <clear_inode+22/110> 39 91 d8 01 00 cmpl %edx,0x1d8(%ecx)
> Code: 123798 <clear_inode+28/110> 7f 0a jg 1237a4 <clear_inode+34/110>
> Code: 12379a <clear_inode+2a/110> b8 e8 ff ff ff movl $0xffffffe8,%eax
> Code: 1237a5 <clear_inode+35/110> 5b popl %ebx
> Code: 1237a6 <clear_inode+36/110> 5e popl %esi
> Code: 1237a7 <clear_inode+37/110> 5f popl %edi
> Code: 1237a8 <clear_inode+38/110> c3 ret
> Code: 1237a9 <clear_inode+39/110> 90 nop
> Code: 1237aa <clear_inode+3a/110> 0f ab 00 btsl %eax,(%eax)
> Code: 1237b3 <clear_inode+43/110> 90 nop
> Code: 1237b4 <clear_inode+44/110> 90 nop
> Code: 1237b5 <clear_inode+45/110> 90 nop
>
> Shortly followed by another one:
> CPU: 0
> EIP: 0010:[<0012499c>]
> EFLAGS: 00010282
> eax: 018efc00 ebx: fffffffe ecx: 00000000 edx: 00000000
> esi: 00000001 edi: 00008000 ebp: 01875f8c esp: 01875f3c
> ds: 0018 es: 0018 fs: 002b gs: 002b ss: 0018
> Process sshd (pid: 273, process nr: 12, stackpage=01875000)
> Stack: 0012c2da 018efc00 019044c8 00000000 01268000 00000005 00000000 018efc00
> 01268005 00000006 00123689 01268000 00000001 00000000 01875f8c 00000000
> 00000000 00000005 00000000 00000000 00000000 0012380f 01268000 00000000
> Call Trace: [<0012c2da>] [<00123689>] [<0012380f>] [<0010a915>]
> Code: 53 8b 5c 24 08 85 db 0f 84 81 01 00 00 80 bb 88 00 00 00 00
>
> Giving:
>
> >>EIP: 12499c <grow_files+8c/90>
> Trace: 12c2da <connect_select+a/1b0>
> Trace: 123689 <insert_inode_hash+29/40>
> Trace: 12380f <clear_inode+9f/110>
> Trace: 10a915 <die_if_kernel+105/2e0>
> Code: 12499c <grow_files+8c/90>
> Code: 12499c <grow_files+8c/90> 53 pushl %ebx
> Code: 12499d <grow_files+8d/90> 8b 5c 24 08 movl 0x8(%esp,1),%ebx
> Code: 1249a1 <file_table_init+1/10> 85 db testl %ebx,%ebx
> Code: 1249a3 <file_table_init+3/10> 0f 84 81 01 00 je 124b2a <__wait_on_buffer+8a/100>
> Code: 1249af <file_table_init+f/10> 80 bb 88 00 00 cmpb $0x0,0x88(%ebx)
>
> 30 seconds later, things got pretty bad :)
>
> Unable to handle kernel paging request at virtual address 01bc2ff0
> current->tss.cr3 = 01bc1000,
> *pde = 00000000
> Oops: 0000
> CPU: 0
> EIP: 0010:[<0010a965>]
> EFLAGS: 00010246
> eax: 00000246 ebx: 01bc3018 ecx: bffffda0 edx: ffff0ff0
> esi: 36c48571 edi: bffffd90 ebp: bffffd60 esp: 01bc2fbc
> ds: 0018 es: 0018 fs: 002b gs: 002b ss: 0018
> Process crond (pid: 48, process nr: 11, stackpage=01bc2000)
> Stack: 00000000 bffffda0 bffffd90 36c48571 bffffd90 bffffd60 36c48576 0000002b
> 0000002b 0000002b 0000002b 0000000d 40075182 00000023 00000246 bffffd5c
> 0000002b
> Call Trace:
> Code: 66 83 7c 24 34 10 74 35 fb 0d 00 02 00 00 25 ff bf ff ff 89
>
> Giving:
>
> >>EIP: 10a965 <die_if_kernel+155/2e0>
> Code: 10a965 <die_if_kernel+155/2e0>
> Code: 10a965 <die_if_kernel+155/2e0> 66 83 7c 24 34 cmpw $0x10,0x34(%esp,1)
> Code: 10a96b <die_if_kernel+15b/2e0> 74 35 je 10a9a2 <die_if_kernel+192/2e0>
> Code: 10a96d <die_if_kernel+15d/2e0> fb sti
> Code: 10a96e <die_if_kernel+15e/2e0> 0d 00 02 00 00 orl $0x200,%eax
> Code: 10a979 <die_if_kernel+169/2e0> 25 ff bf ff ff andl $0xffffbfff,%eax
> Code: 10a97e <die_if_kernel+16e/2e0> 89 00 movl %eax,(%eax)
> Code: 10a986 <die_if_kernel+176/2e0> 90 nop
> Code: 10a987 <die_if_kernel+177/2e0> 90 nop
> Code: 10a988 <die_if_kernel+178/2e0> 90 nop
>
> And the last one:
>
> general protection: 0000
> CPU: 0
> EIP: 0010:[<00123799>]
> EFLAGS: 00010216
> eax: ffffff80 ebx: 01ab1894 ecx: 01911810 edx: 00000007
> esi: 01ab1810 edi: bfffe7d8 ebp: 00000000 esp: 008fcf98
> ds: 0018 es: 0018 fs: 002b gs: 002b ss: 0018
> Process sendmail (pid: 6259, process nr: 11, stackpage=008fc000)
> Stack: 080c96c8 000001a4 001237e8 01911810 080c96c8 bfffe724 bfffe6c4 008fcfbc
> 0010a915 bfffe7d8 00000000 000001a4 080c96c8 bfffe724 bfffe6c4 ffffffda
> 0000002b 0000002b 0000002b 0000002b 00000005 4007466c 00000023 00000202
> Call Trace: [<001237e8>] [<0010a915>]
> Code: 0a b8 e8 ff ff ff 5b 5e 5f c3 90 0f ab 96 84 00 00 00 0f b3
>
> Giving another clear_inode problem:
>
> >>EIP: 123799 <clear_inode+29/110>
> Trace: 1237e8 <clear_inode+78/110>
> Trace: 10a915 <die_if_kernel+105/2e0>
> Code: 123799 <clear_inode+29/110>
> Code: 123799 <clear_inode+29/110> 0a b8 e8 ff ff orb 0xffffffe8(%eax),%bh
> Code: 12379f <clear_inode+2f/110> 5b popl %ebx
> Code: 1237a0 <clear_inode+30/110> 5e popl %esi
> Code: 1237a1 <clear_inode+31/110> 5f popl %edi
> Code: 1237a2 <clear_inode+32/110> c3 ret
> Code: 1237a9 <clear_inode+39/110> 90 nop
> Code: 1237aa <clear_inode+3a/110> 0f ab 96 84 00 btsl %edx,0x84(%esi)
> Code: 1237b1 <clear_inode+41/110> 0f b3 00 btrl %eax,(%eax)
> Code: 1237ba <clear_inode+4a/110> 90 nop
> Code: 1237bb <clear_inode+4b/110> 90 nop
> Code: 1237bc <clear_inode+4c/110> 90 nop
>
>
>
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.rutgers.edu
> Please read the FAQ at http://www.tux.org/lkml/
>

-- 
Paul Wouters                    Postbus 170            Tel: 31-24-360 39 19
Xtended Internet                6500 AD Nijmegen       Fax: 31-24-360 19 99
info@xtdnet.nl                  The Netherlands        http://www.xtdnet.nl/
