Oops/panic in 2.2.5 FAT (readdir?)

Harald Koenig (koenig@tat.physik.uni-tuebingen.de)
Thu, 8 Apr 1999 18:42:06 +0200


this oops below happend when find (from updatedb) tried to read
a FAT partition (mounted as VFAT) which was corrupted. this was
100% reproduceable, but first I didn't think of a fs/kernel problem
because this showed up exactly after replacing memory and CPU in that PC,
but slowing down and playing with hardware didn't help.

so I tried running chkdsk.exe, and it found a block crosslinked
between a file and a directory. after fixing this crosslink with chkdsk
the oops was gone. unfortuneately I don't have a copy of that broken
partition, so now I can't reproduce this bug anymore :(

this oops caused some more oops within a few seconds, finally causing
a kernel panic every time...

here are two different Oops, with some additional syslog output:

-------------------------------------------------------------------------------
attempt to access beyond end of device
03:01: rw=0, want=104534, limit=102784
attempt to access beyond end of device
03:01: rw=0, want=104535, limit=102784
...
Filesystem panic (dev 03:01).
FAT error
File system has been set read-only
Directory 282815: bad FAT
...
03:01: rw=0, want=112444, limit=102784
Directory sread (sector 0x36e78) failed
...
03:01: rw=0, want=112445, limit=102784
Directory sread (sector 0x36e79) failed
...
03:01: rw=0, want=112445, limit=102784
Directory sread (sector 0x36e7a) failed
...
attempt to access beyond end of device
03:01: rw=0, want=123481, limit=102784
Filesystem panic (dev 03:01).
FAT error
Directory 282819: bad FAT
FAT: fat_truncate called though fs is read-only, uhh...
FAT: fat_truncate called though fs is read-only, uhh...
...
Unable to handle kernel paging request at virtual address 3f3f3f53
current->tss.cr3 = 0073b000, %cr3 = 0073b000
*pde = 00000000
Oops: 0000
CPU: 0
EIP: 0010:[<c0146b04>]
EFLAGS: 00010202
eax: 00000001 ebx: 3f3f3f3f ecx: 001cd160 edx: 0000000c
esi: c073de30 edi: c01fd4a6 ebp: 00295b7c esp: c073dddc
ds: 0018 es: 0018 ss: 0018
Process find (pid: 372, process nr: 35, stackpage=c073d000)
Stack: bfffdd8c 000007af 00000000 00000000 0000000c 0000000c c073de00 c073de30
00000001 00000000 00000000 c18107a4 c0510000 00000000 00000000 c0007d00
00000025 c0269a00 00000000 001cd160 c000e200 7c1634a8 7be47522 6127022e
Call Trace: [<c18107a4>]
Code: 8b 4b 14 83 c1 e0 eb 05 c6 44 24 44 00 55 0f b6 54 24 48 52

>>EIP: c0146b04 <fat_readdirx+59c/720>
Trace: c18107a4 <cleanup_module+718/1fc4>
Code: c0146b04 <fat_readdirx+59c/720> 00000000 <_EIP>: <===
Code: c0146b04 <fat_readdirx+59c/720> 0: 8b 4b 14 movl 0x14(%ebx),%ecx <===
Code: c0146b07 <fat_readdirx+59f/720> 3: 83 c1 e0 addl $0xffffffe0,%ecx
Code: c0146b0a <fat_readdirx+5a2/720> 6: eb 05 jmp c0146b11 <fat_readdirx+5a9/720>
Code: c0146b0c <fat_readdirx+5a4/720> 8: c6 44 24 44 00 movb $0x0,0x44(%esp,1)
Code: c0146b11 <fat_readdirx+5a9/720> d: 55 pushl %ebp
Code: c0146b12 <fat_readdirx+5aa/720> e: 0f b6 54 24 48 movzbl 0x48(%esp,1),%edx
Code: c0146b17 <fat_readdirx+5af/720> 13: 52 pushl %edx

Unable to handle kernel paging request at virtual address 4e14b00d
current->tss.cr3 = 01227000, %cr3 = 01227000
*pde = 00000000
Oops: 0000
CPU: 0
EIP: 0010:[<c012d069>]
EFLAGS: 00010202
eax: c0087000 ebx: 00000000 ecx: c121ffa8 edx: 4e14b001
esi: c121e000 edi: 00000005 ebp: 00000000 esp: c121ff54
ds: 0018 es: 0018 ss: 0018
Process rpc.mountd (pid: 95, process nr: 8, stackpage=c121f000)
Stack: c0004700 00000000 00000020 00000145 c121e000 7fffffff 00000000 c0ded000
c012d523 00000007 c121ffa8 c121ffa4 c121e000 00000000 00000000 bffffd64
bffffbe0 c121ffa8 00000280 c0004400 7fffffff c0004400 c0004480 c0004500
Call Trace: [<c012d523>] [<c0108a78>]
Code: 8b 42 0c 85 c0 74 15 8b 40 10 85 c0 74 0e ff 74 24 20 52 ff

>>EIP: c012d069 <do_select+10d/214>
Trace: c012d523 <sys_select+3b3/4d8>
Trace: c0108a78 <system_call+34/38>
Code: c012d069 <do_select+10d/214> 00000000 <_EIP>: <===
Code: c012d069 <do_select+10d/214> 0: 8b 42 0c movl 0xc(%edx),%eax <===
Code: c012d06c <do_select+110/214> 3: 85 c0 testl %eax,%eax
Code: c012d06e <do_select+112/214> 5: 74 15 je c012d085 <do_select+129/214>
Code: c012d070 <do_select+114/214> 7: 8b 40 10 movl 0x10(%eax),%eax
Code: c012d073 <do_select+117/214> a: 85 c0 testl %eax,%eax
Code: c012d075 <do_select+119/214> c: 74 0e je c012d085 <do_select+129/214>
Code: c012d077 <do_select+11b/214> e: ff 74 24 20 pushl 0x20(%esp,1)
Code: c012d07b <do_select+11f/214> 12: 52 pushl %edx
Code: c012d07c <do_select+120/214> 13: ff 00 incl (%eax)
-------------------------------------------------------------------------------
attempt to access beyond end of device
03:01: rw=0, want=117035, limit=102784
Directory sread (sector 0x39256) failed
attempt to access beyond end of device
03:01: rw=0, want=117035, limit=102784
Directory sread (sector 0x39256) failed
Unable to handle kernel NULL pointer dereference at virtual address 00000037
current->tss.cr3 = 0035f000, %cr3 = 0035f000
*pde = 00000000
Oops: 0002
CPU: 0
EIP: 0010:[<c0125a79>]
EFLAGS: 00010286
eax: c12e30d0 ebx: c13d4330 ecx: c0228fd8 edx: ffffffff
esi: 00000200 edi: ffffffff ebp: 00020301 esp: c090dddc
ds: 0018 es: 0018 ss: 0018
Process find (pid: 373, process nr: 33, stackpage=c090d000)
Stack: c000e200 0002e225 00000000 c0125daa 00000301 0002e225 00000200 c13d4380
c01459aa 00000301 0002e225 00000200 00000002 0000b81e c000e200 0000b81c
00000301 c0149ab4 c000e200 0002e225 00000002 0000b81e c000e200 0000b81c
Call Trace: [<c0125daa>] [<c01459aa>] [<c0149ab4>] [<c0149dd4>] [<c0149ff9>] [<c0148e48>] [<c014e3ee>]
[<c0130492>] [<c0130578>] [<c014d648>] [<c012addb>] [<c012af8e>] [<c012b062>] [<c01291c1>] [<c0108a78>]
Code: 89 42 38 8b 53 28 39 1c 95 dc 8f 22 c0 75 21 8b 43 1c 89 04

>>EIP: c0125a79 <getblk+9d/214>
Trace: c0125daa <bread+16/6c>
Trace: c01459aa <fat_bread+5e/11c>
Trace: c0149ab4 <raw_scan_sector+1c/274>
Trace: c0149dd4 <raw_scan_nonroot+54/a0>
Trace: c0149ff9 <fat_subdirs+65/70>
Trace: c0148e48 <fat_read_inode+238/474>
Trace: c014e3ee <vfat_read_inode+e/14>
Trace: c0130492 <get_new_inode+96/11c>
Code: c0125a79 <getblk+9d/214> 00000000 <_EIP>: <===
Code: c0125a79 <getblk+9d/214> 0: 89 42 38 movl %eax,0x38(%edx) <===
Code: c0125a7c <getblk+a0/214> 3: 8b 53 28 movl 0x28(%ebx),%edx
Code: c0125a7f <getblk+a3/214> 6: 39 1c 95 dc 8f 22 cmpl %ebx,0xc0228fdc(,%edx,4)
Code: c0125a85 <getblk+a9/214> c: c0
Code: c0125a86 <getblk+aa/214> d: 75 21 jne c0125aa9 <getblk+cd/214>
Code: c0125a88 <getblk+ac/214> f: 8b 43 1c movl 0x1c(%ebx),%eax
Code: c0125a8b <getblk+af/214> 12: 89 04 00 movl %eax,(%eax,%eax,1)

2 warnings issued. Results may not be reliable.
-------------------------------------------------------------------------------

Harald

--
All SCSI disks will from now on                     ___       _____
be required to send an email notice                0--,|    /OOOOOOO\
24 hours prior to complete hardware failure!      <_/  /  /OOOOOOOOOOO\
                                                    \  \/OOOOOOOOOOOOOOO\
                                                      \ OOOOOOOOOOOOOOOOO|//
Harald Koenig,                                         \/\/\/\/\/\/\/\/\/
Inst.f.Theoret.Astrophysik                              //  /     \\  \
koenig@tat.physik.uni-tuebingen.de                     ^^^^^       ^^^^^

- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.rutgers.edu Please read the FAQ at http://www.tux.org/lkml/