Re: ext2 filesystem corruption?!?!??

Jeff Garzik (jeff.garzik@spinne.com)
Fri, 04 Apr 1997 15:18:37 -0500


Here's the configuration of the machine on which I've been having the
filesystem corruption problems.

- Pentium 90
- 128MB RAM
- onboard PCI EIDE
- 3 PCI NCR53C810 cards
- 5 SCSI drives, roughly 1-2 GB each
- 1 EIDE drive, 2.5 GB

Kernel info:

- 2.0.25, 2.0.29
- NCR53c7,8xx driver compiled with no extra options
(no SCSI DISCONNECT, etc.)
- NO_ATIME patch included (the one that's now in pre-2.0.30-2)

Included below are the relevant probes, and a sample of the errors
produced.

Jeff

-----------------------------------------------------------------------

ide0: buggy cmd640b interface on PCI (type1), config=0x5e
ide1: not serialized, secondary interface not responding
cmd640: drive0 timings/prefetch(on) preserved, clocks=3/3/2
cmd640: drive1 timings/prefetch(on) preserved, clocks=3/3/3
hda: WDC AC32500H, 2441MB w/128kB Cache, LBA, CHS=620/128/63
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14

md driver 0.35 MAX_MD_DEV=4, MAX_REAL=8
raid0 personality registered

scsi-ncr53c7,8xx : at PCI bus 0, device 9, function 0
scsi-ncr53c7,8xx : NCR53c810 at memory 0xfbfef000, io 0xe800, irq 11
scsi0 : burst length 8
scsi0 : reset ccf to 3 from 0
scsi0 : NCR code relocated to 0x14600 (virt 0x00014600)
scsi0 : test 1 started
scsi-ncr53c7,8xx : at PCI bus 0, device 10, function 0
scsi-ncr53c7,8xx : NCR53c810 at memory 0xfbfee000, io 0xe400, irq 9
scsi1 : burst length 2
scsi1 : NCR code relocated to 0x10600 (virt 0x00010600)
scsi1 : test 1 started
scsi-ncr53c7,8xx : at PCI bus 0, device 11, function 0
scsi-ncr53c7,8xx : NCR53c810 at memory 0xfbfed000, io 0xe000, irq 12
scsi2 : burst length 2
scsi2 : NCR code relocated to 0x7ffc600 (virt 0x07ffc600)
scsi2 : test 1 started
scsi0 : NCR53c{7,8}xx (rel 17)
scsi1 : NCR53c{7,8}xx (rel 17)
scsi2 : NCR53c{7,8}xx (rel 17)
scsi : 3 hosts.

scsi0 : target 0 accepting asynchronous SCSI
scsi0 : setting target 0 to asynchronous SCSI
Vendor: QUANTUM Model: LIGHTNING 730S Rev: 241E
Type: Direct-Access ANSI SCSI revision: 02
Detected scsi disk sda at scsi0, channel 0, id 0, lun 0
scsi0 : target 6 accepting asynchronous SCSI
scsi0 : setting target 6 to asynchronous SCSI
Vendor: QUANTUM Model: FIREBALL_TM2110S Rev: 300X
Type: Direct-Access ANSI SCSI revision: 02
Detected scsi disk sdb at scsi0, channel 0, id 6, lun 0

scsi1 : target 3 accepting asynchronous SCSI
scsi1 : setting target 3 to asynchronous SCSI
Vendor: SEAGATE Model: ST31230N Rev: 0300
Type: Direct-Access ANSI SCSI revision: 02
Detected scsi disk sdc at scsi1, channel 0, id 3, lun 0

scsi2 : target 4 accepting asynchronous SCSI
scsi2 : setting target 4 to asynchronous SCSI
Vendor: QUANTUM Model: PD1800S Rev: 3162
Type: Direct-Access ANSI SCSI revision: 02
Detected scsi disk sdd at scsi2, channel 0, id 4, lun 0
scsi2 : target 5 accepting asynchronous SCSI
scsi2 : setting target 5 to asynchronous SCSI
Vendor: QUANTUM Model: PD1800S Rev: 3162
Type: Direct-Access ANSI SCSI revision: 02
Detected scsi disk sde at scsi2, channel 0, id 5, lun 0

scsi : detected 5 SCSI disks total.

SCSI device sda: hdwr sector= 512 bytes. Sectors= 1431760 [699 MB] [0.7 GB]
SCSI device sdb: hdwr sector= 512 bytes. Sectors= 4124736 [2014 MB] [2.0 GB]
SCSI device sdc: hdwr sector= 512 bytes. Sectors= 2069860 [1010 MB] [1.0 GB]
SCSI device sdd: hdwr sector= 512 bytes. Sectors= 3517856 [1717 MB] [1.7 GB]
SCSI device sde: hdwr sector= 512 bytes. Sectors= 3517856 [1717 MB] [1.7 GB]

-----------------------------------------------------------------------

Filesystem 1024-blocks Used Available Capacity Mounted on
/dev/hda1 81944 8340 69372 11% /
/dev/hda5 398124 83798 314326 21% /var
/dev/hda6 152215 100280 44075 69% /usr
/dev/sda1 703706 251835 451871 36% /var/news/etc
/dev/md0 1368533 457885 910648 33% /var/news/spool/over.view
/dev/md1 6737769 3868644 2869125 57% /var/news/spool/articles

-----------------------------------------------------------------------

There were many more of these errors than I have included :)

Mar 31 15:49:00 budokan kernel: EXT2-fs error (device 09:01): ext2_check_blocks_bitmap: Wrong free blocks count for group 119, stored = 7335, counted = 7336
Mar 31 15:49:00 budokan kernel: EXT2-fs error (device 09:01): ext2_check_blocks_bitmap: Wrong free blocks count for group 182, stored = 7303, counted = 7305
Mar 31 15:49:00 budokan kernel: EXT2-fs error (device 09:01): ext2_check_blocks_bitmap: Wrong free blocks count for group 183, stored = 6993, counted = 6997
Mar 31 15:49:00 budokan kernel: EXT2-fs error (device 09:01): ext2_check_blocks_bitmap: Wrong free blocks count for group 294, stored = 1306, counted = 1308
Mar 31 15:49:00 budokan kernel: EXT2-fs error (device 09:01): ext2_check_blocks_bitmap: Wrong free blocks count for group 383, stored = 7108, counted = 7109
Mar 31 15:49:00 budokan kernel: EXT2-fs error (device 09:01): ext2_check_inodes_bitmap: Wrong free inodes count in group 8, stored = 560, counted = 554
Mar 31 15:49:00 budokan kernel: EXT2-fs error (device 09:01): ext2_check_inodes_bitmap: Wrong free inodes count in group 119, stored = 1739, counted = 1740
Mar 31 15:49:00 budokan kernel: EXT2-fs error (device 09:01): ext2_check_inodes_bitmap: Wrong free inodes count in group 182, stored = 1790, counted = 1791
Mar 31 15:49:00 budokan kernel: EXT2-fs error (device 09:01): ext2_check_inodes_bitmap: Wrong free inodes count in super block, stored = 1443572, counted = 1443566
{ last entry in log before crash }
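
For anyone who wants to tally these messages rather than read them by eye, here is a rough Python sketch that parses lines in the format shown above and reports, per device, how far each stored count is from the counted one. The regular expression and the `summarize` helper are my own illustration, not part of any kernel tool, and the sketch assumes the device is printed as hexadecimal major:minor. Under that assumption 09:01 is major 9, minor 1, which on Linux is the md driver's second device, i.e. /dev/md1 (mounted on /var/news/spool/articles in the df output above).

```python
import re
from collections import defaultdict

# Pattern for the ext2 bitmap-check messages quoted in this post.
# Assumption: the device field is hexadecimal "major:minor".
LINE_RE = re.compile(
    r"EXT2-fs error \(device (?P<major>[0-9a-fA-F]+):(?P<minor>[0-9a-fA-F]+)\): "
    r"ext2_check_\w+_bitmap: "
    r"Wrong free (?P<kind>blocks|inodes) count (?:for|in) "
    r"(?P<where>group \d+|super block), "
    r"stored = (?P<stored>\d+), counted = (?P<counted>\d+)"
)


def summarize(lines):
    """Group mismatches by ((major, minor), kind).

    Each entry is (where, counted - stored); a negative delta means the
    stored free count is higher than what the bitmap actually shows.
    Lines that do not match the pattern are skipped.
    """
    result = defaultdict(list)
    for line in lines:
        m = LINE_RE.search(line)
        if m is None:
            continue
        device = (int(m["major"], 16), int(m["minor"], 16))
        delta = int(m["counted"]) - int(m["stored"])
        result[(device, m["kind"])].append((m["where"], delta))
    return dict(result)


# Three of the lines from the log above, used as sample input.
SAMPLE = [
    "Mar 31 15:49:00 budokan kernel: EXT2-fs error (device 09:01): "
    "ext2_check_blocks_bitmap: Wrong free blocks count for group 119, "
    "stored = 7335, counted = 7336",
    "Mar 31 15:49:00 budokan kernel: EXT2-fs error (device 09:01): "
    "ext2_check_inodes_bitmap: Wrong free inodes count in group 8, "
    "stored = 560, counted = 554",
    "Mar 31 15:49:00 budokan kernel: EXT2-fs error (device 09:01): "
    "ext2_check_inodes_bitmap: Wrong free inodes count in super block, "
    "stored = 1443572, counted = 1443566",
]

if __name__ == "__main__":
    for (device, kind), entries in summarize(SAMPLE).items():
        for where, delta in entries:
            print(f"dev {device[0]}:{device[1]} {kind:6s} {where}: delta {delta:+d}")
```

Feeding the full syslog through `summarize` would show whether every mismatch is on the same md device or whether other filesystems are affected as well.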