Re: ext2 and unlink(2)

Theodore Y. Ts'o (tytso@MIT.EDU)
Tue, 30 Sep 1997 19:57:47 -0400


From: Nick.Holloway@alfie.demon.co.uk (Nick Holloway)
Date: 30 Sep 1997 21:04:58 +0100

Spooky. I read your message at home in the morning. Just over an hour
later, when I happened to be in the machine room, the main Linux machine
at work ate a filesystem.

attempt to access beyond end of device
16:01: rw=0, want=942814853, limit=768064
Kernel panic: EXT2-fs panic (device 16:01): ext2_write_inode:
unable to read i-node block - inode=331778, block=942814852
attempt to access beyond end of device
16:01: rw=0, want=892678965, limit=768064
Kernel panic: EXT2-fs panic (device 16:01): ext2_read_inode:
unable to read i-node block - inode=265721, block=892678964
attempt to access beyond end of device
16:01: rw=0, want=221853494, limit=768064
Kernel panic: EXT2-fs panic (device 16:01): read_block_bitmap:
Cannot read block bitmap - block_group = 71, block_bitmap = 221853493

Does this look like the same sort of problem?

No, that's something else. When the kernel went to read a particular
part of the inode table, the block number it tried to use (942814852)
was clearly "out of range" for the disk.

The locations of the inode table, bitmaps, etc. are read into
memory when the filesystem is mounted, and they are validated to make
sure they are sane at mount time. Therefore, some kind of problem must
have corrupted the in-memory table created by the Linux kernel later,
after the filesystem was mounted. (Your server was probably running
just fine for quite a while before the crash, right?)
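
As a rough illustration of why these values count as "out of range", the
numbers from the log can be compared against the device limit. This is a
simplified sketch, not the kernel's actual validation code: the helper
name out_of_range is hypothetical, and it assumes the "limit=" value and
the block numbers are expressed in the same units.

```python
# Hypothetical, simplified stand-in for a sanity check on block locations;
# this is NOT the kernel's code, only an illustration of the range test.
def out_of_range(block, limit):
    """Return True if `block` cannot exist on a device of `limit` blocks."""
    return not (0 <= block < limit)

# "limit=" value from the log (assumed to be in the same units as the
# block numbers).
LIMIT = 768064

# Locations the kernel tried to use, copied from the panic messages.
suspect = {
    "inode table block (inode 331778)": 942814852,
    "inode table block (inode 265721)": 892678964,
    "block bitmap (group 71)": 221853493,
}

for what, block in suspect.items():
    print(f"{what}: {block} out of range: {out_of_range(block, LIMIT)}")
```

All three values are far larger than the limit, so a check of this kind
would have flagged them had they been present when the table was loaded.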

Now, the block numbers 942814852, 892678964, and 221853493, if looked at
as four bytes each, are mostly ASCII: "826\204", "5534", "\r975". This
is not random garbage, so it's probably not memory bitrot. My money is
on either a misdirected DMA (maybe caused by a bus transient of some
kind) which overwrote the kernel data structure, or a kernel bug which
caused the table to get overwritten by trash.
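
The byte-level reading of those block numbers can be reproduced with a
short script. This is only a sketch: the helper names as_bytes and
printable_count are made up for illustration, and the bytes are shown
most significant byte first to match the strings quoted above (on a
little-endian machine they would sit in memory in the reverse order).

```python
import struct

def as_bytes(block):
    """Pack a 32-bit block number, most significant byte first."""
    return struct.pack(">I", block)

def printable_count(raw):
    """Count the bytes that are printable ASCII (0x20 through 0x7e)."""
    return sum(0x20 <= b < 0x7F for b in raw)

# Block numbers copied from the panic messages.
for block in (942814852, 892678964, 221853493):
    raw = as_bytes(block)
    print(block, raw, f"{printable_count(raw)} of 4 bytes printable")
```

Here "\204" is the octal escape for byte 0x84 and "\r" is a carriage
return (0x0d); every other byte is an ASCII digit, which is what makes
the values look like stray text rather than random bit flips.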

- Ted