linux-2.6.18 reiserfs on raid1 blocks io until rebooted - why?

From: Clemens Koller
Date: Mon Oct 01 2007 - 09:46:31 EST


Hi, there!

Can someone please put some light on what happened here:

Short story:
linux kernel 2.6.18 block i/o on file access on reiserfs on raid1.

Long story:
Sorry, for the long mail... I just try to give all details.
On one of our old simple samba file servers, I ran into a
strange problem that programs like slocate, rsync, cp
start to block somewhere in the kernel io when they access files on
reiserfs on /dev/md0.

History:
1. Machine was up for about 200+ days without any problems.

2. On Sept. 05th we had a power outage because a electrician got stuck
somewhere in the cables. :-(

3. Machine rebooted, checked everything, nothing seems to be broken
(not even the electrician).
Kernel log said:
[...]
Sep 5 14:51:54 rio kernel: PDC20269: chipset revision 2
Sep 5 14:51:54 rio kernel: PDC20269: ROM enabled at 0x20000000
Sep 5 14:51:54 rio kernel: PDC20269: 100%% native mode on irq 11
Sep 5 14:51:54 rio kernel: ide2: BM-DMA at 0x9800-0x9807, BIOS settings: hde:pio, hdf:pio
Sep 5 14:51:55 rio kernel: ide3: BM-DMA at 0x9808-0x980f, BIOS settings: hdg:pio, hdh:pio
Sep 5 14:51:55 rio kernel: hde: Maxtor 7L300R0, ATA DISK drive
Sep 5 14:51:55 rio kernel: ide2 at 0xb000-0xb007,0xa802 on irq 11
Sep 5 14:51:56 rio kernel: hdg: Maxtor 7L300R0, ATA DISK drive
Sep 5 14:51:56 rio kernel: ide3 at 0xa400-0xa407,0xa002 on irq 11
Sep 5 14:51:56 rio kernel: hda: max request size: 128KiB
Sep 5 14:51:56 rio kernel: hda: 4124736 sectors (2111 MB) w/128KiB Cache, CHS=4092/16/63, UDMA(33)
Sep 5 14:51:56 rio kernel: hda: cache flushes not supported
Sep 5 14:51:56 rio kernel: hda: hda1 hda2
Sep 5 14:51:56 rio kernel: hdb: max request size: 128KiB
Sep 5 14:51:56 rio kernel: hdb: 8446032 sectors (4324 MB) w/496KiB Cache, CHS=14896/9/63, UDMA(33)
Sep 5 14:51:56 rio kernel: hdb: hdb1 hdb2
Sep 5 14:51:56 rio kernel: hde: max request size: 512KiB
Sep 5 14:51:56 rio kernel: hde: 586114704 sectors (300090 MB) w/16384KiB Cache, CHS=36483/255/63, UDMA(133)
Sep 5 14:51:56 rio kernel: hde: cache flushes supported
Sep 5 14:51:56 rio kernel: hde: hde1
Sep 5 14:51:56 rio kernel: hdg: max request size: 512KiB
Sep 5 14:51:56 rio kernel: hdg: 586114704 sectors (300090 MB) w/16384KiB Cache, CHS=36483/255/63, UDMA(133)
Sep 5 14:51:56 rio kernel: hdg: cache flushes supported
Sep 5 14:51:57 rio kernel: hdg: hdg1
[...]
Sep 5 14:51:59 rio kernel: md: linear personality registered for level -1
Sep 5 14:51:59 rio kernel: md: raid0 personality registered for level 0
Sep 5 14:51:59 rio kernel: md: raid1 personality registered for level 1
Sep 5 14:51:59 rio kernel: md: md driver 0.90.3 MAX_MD_DEVS=256, MD_SB_DISKS=27
Sep 5 14:51:59 rio kernel: md: bitmap version 4.39
Sep 5 14:51:59 rio kernel: TCP bic registered
Sep 5 14:51:59 rio kernel: Initializing IPsec netlink socket
Sep 5 14:51:59 rio kernel: NET: Registered protocol family 1
Sep 5 14:51:59 rio kernel: NET: Registered protocol family 10
Sep 5 14:51:59 rio kernel: IPv6 over IPv4 tunneling driver
Sep 5 14:51:59 rio kernel: NET: Registered protocol family 17
Sep 5 14:51:59 rio kernel: Using IPI Shortcut mode
Sep 5 14:51:59 rio kernel: Time: tsc clocksource has been installed.
Sep 5 14:52:00 rio kernel: md: Autodetecting RAID arrays.
Sep 5 14:52:00 rio kernel: md: autorun ...
Sep 5 14:52:00 rio kernel: md: considering hdg1 ...
Sep 5 14:52:00 rio kernel: md: adding hdg1 ...
Sep 5 14:52:00 rio kernel: md: adding hde1 ...
Sep 5 14:52:00 rio kernel: md: created md0
Sep 5 14:52:00 rio kernel: md: bind<hde1>
Sep 5 14:52:00 rio kernel: md: bind<hdg1>
Sep 5 14:52:00 rio kernel: md: running: <hdg1><hde1>
Sep 5 14:52:00 rio kernel: raid1: raid set md0 active with 2 out of 2 mirrors
Sep 5 14:52:00 rio kernel: md: ... autorun DONE.
Sep 5 14:52:00 rio kernel: ReiserFS: hda1: found reiserfs format "3.6" with standard journal
Sep 5 14:52:01 rio kernel: ReiserFS: hda1: using ordered data mode
Sep 5 14:52:01 rio kernel: ReiserFS: hda1: journal params: device hda1, size 8192, journal first block 18, max trans len 1024, max batch 900, max commit age 30, max trans age 30
Sep 5 14:52:01 rio kernel: ReiserFS: hda1: checking transaction log (hda1)
Sep 5 14:52:01 rio kernel: ReiserFS: hda1: replayed 9 transactions in 2 seconds
Sep 5 14:52:01 rio kernel: ReiserFS: hda1: Using r5 hash to sort names
Sep 5 14:52:01 rio kernel: VFS: Mounted root (reiserfs filesystem) readonly.
Sep 5 14:52:01 rio kernel: Freeing unused kernel memory: 168k freed
Sep 5 14:52:01 rio kernel: ReiserFS: hda1: Removing [585 34512 0x0 SD]..done
Sep 5 14:52:01 rio kernel: ReiserFS: hda1: Removing [41180 15603 0x0 SD]..done
Sep 5 14:52:01 rio kernel: ReiserFS: hda1: Removing [199 10243 0x0 SD]..done
Sep 5 14:52:01 rio kernel: ReiserFS: hda1: Removing [4 10175 0x0 SD]..done
Sep 5 14:52:01 rio kernel: ReiserFS: hda1: Removing [585 10164 0x0 SD]..done
Sep 5 14:52:01 rio kernel: ReiserFS: hda1: There were 5 uncompleted unlinks/truncates. Completed
Sep 5 14:52:01 rio kernel: ReiserFS: md0: found reiserfs format "3.6" with standard journal
Sep 5 14:52:01 rio kernel: ReiserFS: md0: using ordered data mode
Sep 5 14:52:01 rio kernel: ReiserFS: md0: journal params: device md0, size 8192, journal first block 18, max trans len 1024, max batch 900, max commit age 30, max trans age 30
Sep 5 14:52:01 rio kernel: ReiserFS: md0: checking transaction log (md0)
Sep 5 14:52:01 rio kernel: ReiserFS: md0: replayed 1 transactions in 0 seconds
Sep 5 14:52:01 rio kernel: ReiserFS: md0: Using r5 hash to sort names
Sep 5 14:52:01 rio kernel: Adding 206632k swap on /dev/hda2. Priority:-1 extents:1 across:206632k
Sep 5 14:52:02 rio kernel: eth0: link up, 100Mbps, full-duplex, lpa 0x45E1

Comments:
- hda1 is the system /, md0 is the raid1 mounted to /share.
- Harddisks are physically ok, checked with smartctl.

4. starting around Sept. 14th, nightly called slocate started to freeze in
state D. Frozen processes start to cumulate. Nobody realized problems.
Samba access to the machine was possible without problems.

5. Sept. 28th, I got called because the backup process
(rsync from parts of /dev/md0 to external usb hdd) failed...
rsync was also blocked in state D, along with all the dead 14 slocate
processes. Unable to kill these processes.

6. manual rsync to some /tmp/test blocked while it was indexing the
files on /dev/md0 at some unknown location.

7. strace rsync -verbose -... from /dev/md0 to /tmp/test2 hangs also after:

mkdir("/tmp/test2/simulation", 0700) = 0
lstat64("/tmp/test2/simulation", {st_mode=S_IFDIR|0700, st_size=40, ...}) = 0
open("simulation", O_RDONLY|O_NONBLOCK|O_LARGEFILE|O_DIRECTORY) = 3
fstat64(3, {st_mode=S_IFDIR|0755, st_size=384, ...}) = 0
mmap2(NULL, 135168, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1 0) = 0xbe7e8e000
getdents64(3,

The cursor just blinks after the (3, _
No reaction to user input.

8. A similar cp froze in the same folder, too.
I got no entries in the kernel log!

9. mounted /dev/md0 readonly (device was busy, of course)

10. reiserfsck /dev/md0 didn't tell any errors

11. shutdown -r now machine to 2.6.18. Log says:
Sep 28 20:42:52 rio kernel: md: Autodetecting RAID arrays.
Sep 28 20:42:52 rio kernel: Time: tsc clocksource has been installed.
Sep 28 20:42:53 rio kernel: md: autorun ...
Sep 28 20:42:53 rio kernel: md: considering hdg1 ...
Sep 28 20:42:53 rio kernel: md: adding hdg1 ...
Sep 28 20:42:53 rio kernel: md: adding hde1 ...
Sep 28 20:42:53 rio kernel: md: created md0
Sep 28 20:42:53 rio kernel: md: bind<hde1>
Sep 28 20:42:53 rio kernel: md: bind<hdg1>
Sep 28 20:42:53 rio kernel: md: running: <hdg1><hde1>
Sep 28 20:42:53 rio kernel: raid1: raid set md0 active with 2 out of 2 mirrors
Sep 28 20:42:53 rio kernel: md: ... autorun DONE.
Sep 28 20:42:54 rio kernel: ReiserFS: hda1: found reiserfs format "3.6" with standard journal
Sep 28 20:42:54 rio kernel: ReiserFS: hda1: using ordered data mode
Sep 28 20:42:54 rio kernel: ReiserFS: hda1: journal params: device hda1, size 8192, journal first block 18, max trans len 1024, max batch 900, max commit age 30, max trans age 30
Sep 28 20:42:54 rio kernel: ReiserFS: hda1: checking transaction log (hda1)
Sep 28 20:42:54 rio kernel: ReiserFS: hda1: replayed 25 transactions in 4 seconds
Sep 28 20:42:54 rio kernel: ReiserFS: hda1: Using r5 hash to sort names
Sep 28 20:42:54 rio kernel: VFS: Mounted root (reiserfs filesystem) readonly.
Sep 28 20:42:54 rio kernel: Freeing unused kernel memory: 168k freed
Sep 28 20:42:55 rio kernel: Adding 206632k swap on /dev/hda2. Priority:-1 extents:1 across:206632k
Sep 28 20:42:55 rio kernel: eth0: link up, 100Mbps, full-duplex, lpa 0x45E1
Sep 28 20:42:55 rio kernel: hdc: drive_cmd: status=0x58 { DriveReady SeekComplete DataRequest }
Sep 28 20:42:55 rio kernel: ide: failed opcode was: 0xa1
Sep 28 20:44:42 rio kernel: ReiserFS: md0: found reiserfs format "3.6" with standard journal
Sep 28 20:44:52 rio kernel: ReiserFS: md0: using ordered data mode
Sep 28 20:44:52 rio kernel: ReiserFS: md0: journal params: device md0, size 8192, journal first block 18, max trans len 1024, max batch 900, max commit age 30, max trans age 30
Sep 28 20:44:52 rio kernel: ReiserFS: md0: checking transaction log (md0)
Sep 28 20:44:52 rio kernel: ReiserFS: md0: Using r5 hash to sort names

/dev/hda1 had some replayed transactions again!

12. everything works fine again - no chance to reproduce that case.
reiserfsck'ed again without problems.

13. rsync backup from /dev/md0 to external usb hdd went fine.


Some Questions:
- Is that a known bug?
- How can I debug that problem if unknown.
(I can take that machine down for some hours if needed)
- I can change the kernel to i.e. the latest 2.6.22 if there is
no way to debug that thingy and see if it works better.

Some more details:
$ uname -a
Linux rio 2.6.18 #3 Mon Oct 9 18:17:38 CEST 2006 i686 pentium3 i386 GNU/Linux

$ lspci
00:00.0 Host bridge: Intel Corporation 440BX/ZX/DX - 82443BX/ZX/DX Host bridge (rev 03)
00:01.0 PCI bridge: Intel Corporation 440BX/ZX/DX - 82443BX/ZX/DX AGP bridge (rev 03)
00:04.0 ISA bridge: Intel Corporation 82371AB/EB/MB PIIX4 ISA (rev 02)
00:04.1 IDE interface: Intel Corporation 82371AB/EB/MB PIIX4 IDE (rev 01)
00:04.2 USB Controller: Intel Corporation 82371AB/EB/MB PIIX4 USB (rev 01)
00:04.3 Bridge: Intel Corporation 82371AB/EB/MB PIIX4 ACPI (rev 02)
00:0a.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL-8139/8139C/8139C+ (rev 10)
00:0b.0 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 50)
00:0b.1 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 50)
00:0b.2 USB Controller: VIA Technologies, Inc. USB 2.0 (rev 51)
00:0c.0 Mass storage controller: Promise Technology, Inc. 20269 (rev 02)
01:00.0 VGA compatible controller: Matrox Graphics, Inc. MGA G100 [Productiva] AGP (rev 01)

$ cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 7
model name : Pentium III (Katmai)
stepping : 2
cpu MHz : 451.024
cache size : 512 KB
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 2
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 sep mtrr pge mca cmov pat pse36 mmx fxsr sse
bogomips : 902.79

$ mdadm --detail /dev/md0
/dev/md0:
Version : 00.90.03
Creation Time : Sat Oct 29 20:51:41 2005
Raid Level : raid1
Array Size : 293049600 (279.47 GiB 300.08 GB)
Used Dev Size : 293049600 (279.47 GiB 300.08 GB)
Raid Devices : 2
Total Devices : 2
Preferred Minor : 0
Persistence : Superblock is persistent

Update Time : Mon Oct 1 14:59:30 2007
State : clean
Active Devices : 2
Working Devices : 2
Failed Devices : 0
Spare Devices : 0

UUID : b2d7f60d:008fa0df:136207f1:ae0e7a9e
Events : 0.432306

Number Major Minor RaidDevice State
0 33 1 0 active sync /dev/hde1
1 34 1 1 active sync /dev/hdg1

Thank you in advance,
--
Clemens Koller
__________________________________
R&D Imaging Devices
Anagramm GmbH
Rupert-Mayer-Straße 45/1
Linhof Werksgelände
D-81379 München
Tel.089-741518-50
Fax 089-741518-19
http://www.anagramm-technology.com
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/