Bugreport for 2.2.6

Torsten Mohr (tmohr@stuttgart.netsurf.de)
Mon, 26 Apr 1999 20:36:17 +0200 (MEST)


Hi,

I have just updated section 5 of this bug report. It now contains
the output of "insmod -m" for sg.o and cdrom.o, the unprocessed
Oops output and the processed Oops output.
My System.map is available on my web page:

http://www.stuttgart.netsurf.de/~tmohr/bugreport/bug.html

I'm sorry if my reports are not quite what you expect,
but this is my first bug report. This part of Linux is
new to me. I hope I included all the relevant information.

Best regards,
Torsten.

---------------
Here's the output of "ver_linux"

schleim:/usr/src/linux/scripts # sh ver_linux
-- Versions installed: (if some fields are empty or looks
-- unusual then possibly you have very old versions)
Linux schleim 2.2.6 #1 Sun Apr 18 12:54:36 MEST 1999 i586 unknown
Kernel modules 2.1.85
Gnu C 2.7.2.3
Binutils 2.9.1.0.15
Linux C Library x 1 root root 2478585 Jan 14 22:29 /lib/libc.so.6
Dynamic linker ldd (GNU libc) 2.0.7
Procps 1.2.7
Mount 2.9
Net-tools 1.46
Kbd 0.96
Sh-utils 1.12
Modules Loaded ppp slhc nfs lockd sunrpc ne 8390 dummy0 serial unix
schleim:/usr/src/linux/scripts #

----------------------------------------
[1.] One line summary of the problem:

The kernel crashes when it reads a corrupted sector on an ext2-formatted CD

----------------------------------------
[2.] Full description of the problem/report:

I use a German distribution, SuSE 6.0, and just updated to
kernel 2.2.6.

I created an empty 650MB file, created an ext2 file system on it,
mounted it via loopback and tar'ed some relevant directories
into it. Then I burned this file onto a CD.

Now I wanted to restore a file from home.tgz, so I
mounted my /big, cd'ed into it and typed:

mount -t ext2 /dev/cdrom /cdrom
[*1]
tar tvfz /cdrom/home.tgz >inhalt

The system crashed. I couldn't see the "Oops" because
X11 was up. No Ctrl-Alt-Delete, no pointer reaction.

The file "inhalt" was created on /big, but only partially:
approximately 600 kB of it was written.

Then I hard-reset my computer, repeated the steps up to
point [*1] and typed:

tar xvfz /cdrom/home.tgz home/tmohr/pilot/rom30german.zip

It took a while, so I logged in on another console.
Then suddenly the box crashed with the output given
in "panic".

I think tar looks through the whole archive, doesn't it?
The last file it processed (according to "inhalt")
was a very large file.

tail inhalt:
tmohr@schleim:/big > tail inhalt
-rw------- tmohr/users 7354 1998-09-13 11:14 home/my/pics/inside.gif
-rw-r--r-- tmohr/users 113021 1998-09-23 19:02 home/my/pics/visitenkarte.jpg
drwxr-xr-x root/root 0 1998-11-07 08:59 home/my/midi/
-rw-r--r-- root/root 38548 1998-11-07 08:59 home/my/midi/easy.mid
-rwxr-xr-x root/root 16169 1998-11-07 08:59 home/my/midi/hheureu4.mid
-rw------- root/root 21221 1998-11-07 08:59 home/my/midi/jazz.mid
-rwxr-xr-x root/root 64760 1998-11-07 08:59 home/my/midi/stair_h.mid
-rwxr-xr-x root/root 18978 1998-11-07 08:59 home/my/midi/willkom4.mid
-rw-r--r-- root/root 40213 1998-11-07 08:59 home/my/midi/womaninl.mid
-rw-r--r-- root/root 11076775 1998-05-30 20:59 home/home-iei.tgz

I did all the actions above as "root".

As a user I tried:
tmohr@schleim:/big > cat /cdrom/home.tgz >/dev/null
cat: /cdrom/home.tgz: Eingabe-/Ausgabefehler
tmohr@schleim:/big >

This is German for "input/output error".
This makes me think that this isn't purely a CDROM related bug,
because just reading the file this way fails cleanly without a crash.
It seems there is a corrupted sector on the CD.
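To narrow down where the I/O error occurs without going through tar, a file can be read block by block and the failing offsets recorded. The following is only a sketch: the function name and the 2048-byte block size (the usual data CD sector size) are assumptions, and kernel read-ahead may cause neighbouring blocks to fail together.

```python
import errno
import os


def find_unreadable_blocks(path, block_size=2048):
    """Return the byte offsets of blocks whose read fails with EIO."""
    bad_offsets = []
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.fstat(fd).st_size
        offset = 0
        while offset < size:
            try:
                # pread reads at an explicit offset, so one failed block
                # does not disturb the position used for the next one.
                os.pread(fd, block_size, offset)
            except OSError as exc:
                if exc.errno != errno.EIO:
                    raise  # anything other than an I/O error is unexpected
                bad_offsets.append(offset)
            offset += block_size
    finally:
        os.close(fd)
    return bad_offsets
```

Called as find_unreadable_blocks("/cdrom/home.tgz"), this would return an empty list for a clean file and a list of offsets for a damaged one. It works on regular files only, because the size is taken from fstat.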

So I tried another backup CD and typed

tmohr@schleim:/big > tar tvfz /cdrom/home.tgz >inhalt2
This doesn't crash the system.

Then I tried to crash the system as a user by typing
"tar tvfz /cdrom/home.tgz" using the corrupted CD.

This crashed the system again, this time as a
normal user. In the Oops there's exactly the same
"Call Trace", exactly the same "Code".
It seems the problem is related to corrupted CDs.

So I took a knife and another backup CD and scratched
the CD.
Typing "tar tvfz /cdrom/home.tgz >inhalt" crashed
the system. So the problem seems to be reproducible.

My guess:
The problem seems to be related to:
- CD access
- ext2 access
- corrupted sectors
- tar

I was able to crash the system as a user, so I don't
think "tar" is the problem.

I was able to read the sectors on the CD by typing
"dd if=/dev/cdrom of=/dev/null", so I don't think
that this problem is related to CDROMs alone.

I think when someone writes a file system (like ext2),
it is hard to test its behaviour in the presence of
corrupted blocks.
I wouldn't be surprised if that was not tested, because it is
hard to test.
The problem could also be related to the way the
CDROM driver reports "read errors" to the ext2 system.

----------------------------------------
[3.] Keywords (i.e., modules, networking, kernel):
EXT2, CORRUPT BLOCKS, CDROM, TAR, READ-ERROR

----------------------------------------
[4.] Kernel version (from /proc/version):
schleim:/proc # cat version
Linux version 2.2.6 (root@schleim) (gcc version 2.7.2.3) #1 Sun Apr 18 12:54:36 MEST 1999

----------------------------------------
[5.] Output of Oops.. message (if applicable) with symbolic information
resolved (see Documentation/oops-tracing.txt)

I loaded the following modules:
schleim:/lib/modules/2.2.6/scsi # insmod -m sg.o
Sections: Size Address Align
.this 0000004c c484f000 2**2
.text 00002228 c484f04c 2**2
.fixup 00000018 c4851274 2**0
.rodata 000004be c485128c 2**2
__ex_table 00000010 c485174c 2**2
.data 0000007c c485175c 2**2
.kstrtab 0000001b c48517d8 2**0
.bss 00000000 c48517f4 2**2
__ksymtab 00000010 c48517f4 2**2

Symbols:
00000000 a sg.c
c484f000 d __this_module
c484f04c t gcc2_compiled.
c484f04c t .text
c484f04c t sg_open
c484f208 t sg_release
c484f2a4 t sg_read
c484f434 t sg_write
c484f6c8 t sg_ioctl
c484fecc t sg_poll
c484ff50 t sg_fasync
c484ff8c t sg_command_done
c48501d0 t sg_debug_all
c4850298 t sg_debug
c4850478 t sg_detect
c48504e8 t sg_init
c4850584 t sg_attach
c4850620 t sg_finish
c4850624 t sg_detach
c48506c0 t init_module
c48506dc t cleanup_module
c485072c t sg_shorten_timeout
c4850738 t sg_sc_build
c4850a60 t sg_sc_undo_rem
c4850b80 t sg_get_request
c4850ba8 t sg_add_request
c4850c38 t sg_remove_request
c4850ca0 t sg_add_sfp
c4850d84 t sg_remove_sfp
c4850e6c t sg_fb_in_use
c4850e90 t sg_low_malloc
c4851028 t sg_malloc
c4851184 t sg_low_free
c485120c t sg_free
c485124c t sg_clr_scpnt
c4851274 t .fixup
c485128c r .rodata
c48512a8 r size_sg_header
c485174c r __ex_table
c485175c d .data
c485175c d sg_version_str
c4851760 D sg_big_buff
c4851764 d sg_pool_secs_avail
c4851768 D sg_template
c4851794 d sg_dev_arr
c4851798 d sg_fops
c48517d4 d sg_registered.488
c48517f4 d .bss

schleim:/lib/modules/2.2.6/cdrom # insmod -m cdrom.o
Sections: Size Address Align
.this 0000004c c4853000 2**2
.text 00001d3b c485304c 2**2
.fixup 000000bc c4854d87 2**0
.rodata 00000a7f c4854e43 2**0
__ex_table 00000078 c48558c4 2**2
.kstrtab 00000044 c485593c 2**0
__ksymtab 00000020 c4855980 2**2
.data 00000550 c48559a0 2**2
.bss 00000004 c4855ef0 2**2

Symbols:
00000000 a cdrom.c
c4853000 d __this_module
c485304c t gcc2_compiled.
c485304c T register_cdrom
c485304c t .text
c485321c T unregister_cdrom
c48532b0 t cdrom_find_device
c48532d0 t cdrom_open
c4853398 t open_for_data
c4853654 t check_for_audio_disc
c48537d8 t cdrom_release
c48538fc t media_changed
c4853978 t cdrom_media_changed
c48539b4 T cdrom_count_tracks
c4853b04 t sanitize_format
c4853b6c t cdrom_ioctl
c48548fc T cdrom_sysctl_info
c4854cf4 t cdrom_procfs_modcount
c4854d24 t cdrom_sysctl_register
c4854d5c t cdrom_sysctl_unregister
c4854d6c t init_module
c4854d74 t cleanup_module
c4854d87 t .fixup
c4854e43 r .rodata
c48558c4 r __ex_table
c485593c r .kstrtab
c485593c R __kstrtab_cdrom_count_tracks
c485594f R __kstrtab_register_cdrom
c485595e R __kstrtab_unregister_cdrom
c485596f R __kstrtab_cdrom_fops
c4855980 R __ksymtab_cdrom_count_tracks
c4855980 r __ksymtab
c4855988 R __ksymtab_register_cdrom
c4855990 R __ksymtab_unregister_cdrom
c4855998 R __ksymtab_cdrom_fops
c48559a0 d debug
c48559a0 d .data
c48559a4 d keeplocked
c48559a8 d autoclose
c48559ac d autoeject
c48559b0 d lockdoor
c48559b4 d check_media_type
c48559b8 d topCdromPtr
c48559bc D cdrom_fops
c48559f8 d banner_printed.298
c48559f9 d cdrom_drive_info
c4855de4 D cdrom_table
c4855e3c D cdrom_cdrom_table
c4855e94 D cdrom_root_table
c4855eec d initialized.327
c4855ef0 d .bss
c4855ef0 d cdrom_sysctl_header

schleim:/tmp # ksymoops -m /usr/src/linux/System.map <panic2
Options used: -V (default)
-o /lib/modules/2.2.6/ (default)
-k /proc/ksyms (default)
-l /proc/modules (default)
-m /usr/src/linux/System.map (specified)
-c 1 (default)

Warning in compare_ksyms_lsmod, module nfs is in lsmod but not in ksyms, probably no symbols exported
Oops: 0000
CPU: 0
EIP: 0010:[<c48645fb>]
EFLAGS: 00010046
eax: 00000000 ebx: 00000000 ecx: c009a800 edx: c1acc000
esi: c0087600 edi: c0087600 ebp: 0000000c esp: c01ebea0
ds: 0018 es: 0018 ss: 0018
Process swapper (pid: 0, process nr: 0, stackpage=c01eb000)
Stack: 00000793 00000028 001090dc c0087000 00000004 00000000 00000018 28000002
c018f2d2 c1acc000 c009a878 c009a878 00000286 c01ebf64 c009a878 00000000
00000002 c009a800 c0199ee3 c1acc000 c009a878 c01a00b6 c009a878 c02ba7e0
Call Trace: [<c018f2d2>] [<c0199ee3>] [<c01a00b6>] [<c0108ba9>] [<c0108967>] [<c0108c9f>] [<c0107ba8>]
[<c0106339>] [<c0106000>] [<c010635c>] [<c0107b0c>] [<c0106000>] [<c010607b>] [<c0106000>] [<c0100176>]
Code: 66 8b 43 7e 66 85 c0 74 4c 8b bb 88 00 00 00 31 f6 66 85 c0

>>EIP: c48645fb <END_OF_CODE+2e69f/????>
Trace: c018f2d2 <scsi_old_done+5de/5ec>
Trace: c0199ee3 <aic7xxx_done_cmds_complete+2b/38>
Trace: c01a00b6 <do_aic7xxx_isr+5e/78>
Trace: c0108ba9 <handle_IRQ_event+31/68>
Trace: c0108967 <do_8259A_IRQ+77/a0>
Trace: c0108c9f <do_IRQ+23/40>
Trace: c0107ba8 <ret_from_intr+0/20>
Trace: c0106339 <cpu_idle+5d/6c>
Code: c48645fb <END_OF_CODE+2e69f/????> 00000000 <_EIP>: <===
Code: c48645fb <END_OF_CODE+2e69f/????> 0: 66 8b 43 7e movw 0x7e(%ebx),%ax <===
Code: c48645ff <END_OF_CODE+2e6a3/????> 4: 66 85 c0 testw %ax,%ax
Code: c4864602 <END_OF_CODE+2e6a6/????> 7: 74 4c je c4864650 <END_OF_CODE+2e6f4/????>
Code: c4864604 <END_OF_CODE+2e6a8/????> 9: 8b bb 88 00 00 movl 0x88(%ebx),%edi
Code: c4864609 <END_OF_CODE+2e6ad/????> e: 00
Code: c486460a <END_OF_CODE+2e6ae/????> f: 31 f6 xorl %esi,%esi
Code: c486460c <END_OF_CODE+2e6b0/????> 11: 66 85 c0 testw %ax,%ax

Aiee, killing interrupt handler
Kernel panic: Attempted to kill the idle task!
In swapper task - not syncing

1 warning issued. Results may not be reliable.
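The register dump and the module maps allow a quick sanity check by hand. The sketch below assumes the standard i386 page-fault error-code bits (bit 0: protection violation vs. page not present, bit 1: write vs. read, bit 2: user vs. kernel mode). The module ranges are taken from the "insmod -m" listings in this report, which were produced after the crash and so may not match the layout at crash time. The function names are made up for illustration.

```python
# Address ranges (start, end) derived from the "insmod -m" section tables:
# start of .this up to the end of the last listed section.
MODULES = {
    "sg": (0xC484F000, 0xC48517F4 + 0x10),
    "cdrom": (0xC4853000, 0xC4855EF0 + 0x4),
}

EIP = 0xC48645FB          # from the Oops
EBX = 0x00000000          # from the Oops
FAULT_ADDRESS = EBX + 0x7E  # faulting instruction: movw 0x7e(%ebx),%ax


def owning_module(addr, modules):
    """Return the name of the module whose range contains addr, else None."""
    for name, (start, end) in modules.items():
        if start <= addr < end:
            return name
    return None


def decode_i386_page_fault(error_code):
    """Split an i386 page-fault error code into its three low bits."""
    return {
        "cause": "protection violation" if error_code & 1 else "page not present",
        "access": "write" if error_code & 2 else "read",
        "mode": "user" if error_code & 4 else "kernel",
    }
```

With these values, owning_module(EIP, MODULES) returns None, so the faulting code lies in neither sg nor cdrom as they were loaded here. decode_i386_page_fault(0) describes a kernel-mode read of a page that is not present, and the fault address works out to 0x7e, which is consistent with a structure member being read through a NULL pointer in ebx.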

Here's the unprocessed Oops:
Oops: 0000
CPU: 0
EIP: 0010:[<c48645fb>]
EFLAGS: 00010046
eax: 00000000 ebx: 00000000 ecx: c009a800 edx: c1acc000
esi: c0087600 edi: c0087600 ebp: 0000000c esp: c01ebea0
ds: 0018 es: 0018 ss: 0018
Process swapper (pid: 0, process nr: 0, stackpage=c01eb000)
Stack: 00000793 00000028 001090dc c0087000 00000004 00000000 00000018 28000002
c018f2d2 c1acc000 c009a878 c009a878 00000286 c01ebf64 c009a878 00000000
00000002 c009a800 c0199ee3 c1acc000 c009a878 c01a00b6 c009a878 c02ba7e0
Call Trace: [<c018f2d2>] [<c0199ee3>] [<c01a00b6>] [<c0108ba9>] [<c0108967>] [<c0108c9f>] [<c0107ba8>]
[<c0106339>] [<c0106000>] [<c010635c>] [<c0107b0c>] [<c0106000>] [<c010607b>] [<c0106000>] [<c0100176>]
Code: 66 8b 43 7e 66 85 c0 74 4c 8b bb 88 00 00 00 31 f6 66 85 c0
Aiee, killing interrupt handler
Kernel panic: Attempted to kill the idle task!
In swapper task - not syncing

----------------------------------------
[6.] A small shell script or example program which triggers the
problem (if possible)

a. dd if=/dev/zero of=cd_image bs=1024k count=650
b. mke2fs -F cd_image
c. mount -t ext2 -o loop cd_image /mnt
d. tar cfpz /mnt/file.tgz /usr /opt /home (or whatever fits on it)
e. umount /mnt
f. cdrecord -v dev=<id>,0 cd_image
g. Take the CD and a knife. Make a scratch on the CD not too
close to the beginning, so you can still mount the CD but
there are errors on it.
h. mount -t ext2 /dev/cdrom /mnt
i. tar tvfz /mnt/file.tgz >/dev/null
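Between steps g and h it may be useful to confirm that the scratch really produced damaged sectors, by comparing the raw device with the image file without mounting anything. This is only a sketch: the function name and the 2048-byte sector size are assumptions, and the comparison covers only the length of the image.

```python
import errno
import os

SECTOR = 2048  # assumed data CD sector size in bytes


def compare_with_image(image_path, device_path, sector=SECTOR):
    """Return (unreadable, different) lists of sector indices."""
    unreadable, different = [], []
    img = os.open(image_path, os.O_RDONLY)
    try:
        dev = os.open(device_path, os.O_RDONLY)
        try:
            size = os.fstat(img).st_size
            for index in range((size + sector - 1) // sector):
                offset = index * sector
                expected = os.pread(img, sector, offset)
                try:
                    actual = os.pread(dev, sector, offset)
                except OSError as exc:
                    if exc.errno != errno.EIO:
                        raise
                    unreadable.append(index)  # the drive reported a read error
                    continue
                if actual != expected:
                    different.append(index)  # readable, but the data changed
        finally:
            os.close(dev)
    finally:
        os.close(img)
    return unreadable, different
```

Called as compare_with_image("cd_image", "/dev/cdrom"), it would list the sectors the drive cannot read and the sectors that read back with different contents.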

----------------------------------------
[7.] Environment

[7.1.] Software (add the output of the ver_linux script here)
schleim:/usr/src/linux/scripts # sh ver_linux
-- Versions installed: (if some fields are empty or looks
-- unusual then possibly you have very old versions)
Linux schleim 2.2.6 #1 Sun Apr 18 12:54:36 MEST 1999 i586 unknown
Kernel modules 2.1.85
Gnu C 2.7.2.3
Binutils 2.9.1.0.15
Linux C Library x 1 root root 2478585 Jan 14 22:29 /lib/libc.so.6
Dynamic linker ldd (GNU libc) 2.0.7
Procps 1.2.7
Mount 2.9
Net-tools 1.46
Kbd 0.96
Sh-utils 1.12
Modules Loaded ppp slhc nfs lockd sunrpc ne 8390 dummy0 serial unix

[7.2.] Processor information (from /proc/cpuinfo):
schleim:/proc # cat cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 5
model : 4
model name : Pentium MMX
stepping : 3
cpu MHz : 200.456338
fdiv_bug : no
hlt_bug : no
sep_bug : no
f00f_bug : yes
fpu : yes
fpu_exception : yes
cpuid level : 1
wp : yes
flags : fpu vme de pse tsc msr mce cx8 mmx
bogomips : 399.77

[7.3.] Module information (from /proc/modules):
schleim:/proc # cat modules
ppp 17736 0 (autoclean)
slhc 4128 0 (autoclean) [ppp]
nfs 27868 1 (autoclean)
lockd 28932 1 (autoclean) [nfs]
sunrpc 47864 1 (autoclean) [nfs lockd]
ne 5984 1 (autoclean)
8390 5944 0 (autoclean) [ne]
dummy0 720 1 (autoclean)
serial 18112 0 (autoclean)
unix 10056 61 (autoclean)

[7.4.] SCSI information (from /proc/scsi/scsi)
schleim:/proc/scsi # cat scsi
Attached devices:
Host: scsi0 Channel: 00 Id: 00 Lun: 00
Vendor: QUANTUM Model: FIREBALL_TM3200S Rev: 300X
Type: Direct-Access ANSI SCSI revision: 02
Host: scsi0 Channel: 00 Id: 03 Lun: 00
Vendor: PIONEER Model: CD-ROM DR-U12X Rev: 1.06
Type: CD-ROM ANSI SCSI revision: 02
Host: scsi0 Channel: 00 Id: 04 Lun: 00
Vendor: SCSI-CD Model: ReWritable-2x2x6 Rev: 2.00
Type: CD-ROM ANSI SCSI revision: 02

[7.5.] Other information that might be relevant to the problem
(please look in /proc and include all information that you
think to be relevant):

schleim:/proc/ide # cat drivers
ide-disk version 1.08

schleim:/proc/ide/ide0 # cat model
pci

schleim:/proc/ide/ide0 # cat mate
(none)

schleim:/proc/ide/ide0 # cat config
pci bus 00 device 39 vid 8086 did 7111 channel 0
86 80 11 71 05 00 80 02 01 80 01 01 00 20 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
01 f0 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
07 a3 00 80 33 00 00 00 01 00 02 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 28 0f 00 00 00 00 00 00

schleim:/proc/ide/ide0 # cat channel
0

schleim:/proc/ide/ide0/hda # cat model
IBM-DTTA-351680

schleim:/proc/ide/ide0/hda # cat capacity
33022080

schleim:/proc/ide/ide0/hda # cat settings
name value min max mode
---- ----- --- --- ----
bios_cyl 2055 0 65535 rw
bios_head 255 0 255 rw
bios_sect 63 0 63 rw
breada_readahead 4 0 127 rw
bswap 0 0 1 r
file_readahead 72 0 2097151 rw
io_32bit 0 0 3 rw
keepsettings 0 0 1 rw
max_kb_per_request 122 1 127 rw
multcount 0 0 8 rw
nice1 1 0 1 rw
nowerr 0 0 1 rw
pio_mode write-only 0 255 w
slow 0 0 1 rw
unmaskirq 0 0 1 rw
using_dma 1 0 1 rw

----------------------------------------
[X.] Other notes, patches, fixes, workarounds:

I think it would be interesting to see how other file systems
that can be burned onto a CD behave (iso9660, minix, ...).

I'd also like to note that this could be a way to test
how file systems handle corrupted blocks.

The only reason "root" is involved in this problem is
that "root" had to mount the CD, but there could
also be an entry like:
/dev/scd0 /cdrom ext2 ro,noauto,user 0 0

This way, any ordinary user could crash the system.
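To see whether a given machine is exposed in this way, the fstab entries that ordinary users may mount can be listed. The sketch below is an assumption-laden illustration: the function name is made up, and it only looks for the "user" and "users" mount options in the fourth field.

```python
def user_mountable(fstab_text):
    """Return (device, mount_point, fstype) for entries users may mount."""
    entries = []
    for line in fstab_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        fields = line.split()
        if len(fields) < 4:
            continue  # not a complete fstab entry
        device, mount_point, fstype, options = fields[:4]
        option_list = options.split(",")
        if "user" in option_list or "users" in option_list:
            entries.append((device, mount_point, fstype))
    return entries
```

Applied to the example line above, it returns [("/dev/scd0", "/cdrom", "ext2")].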

----------------------------------------

Please don't hesitate to write if you need further
information or tests. I'd be glad to help you find
the error.

Please note that the complete kernel, the System.map,
the kernel configuration file that I used to
compile the kernel and the Oops output are available
at:

http://www.stuttgart.netsurf.de/~tmohr/bug/bugreport.html
(There is NO index.html, so you have to use this complete URL).

Best regards,
tmohr@stuttgart.netsurf.de
