Re: Can anyone tell me the meaning of this error and if I should

Theodore Y. Ts'o (tytso@MIT.EDU)
Wed, 18 Mar 1998 14:12:36 -0500


Hi Tom,

I just investigated the debugfs output you sent me. The block
numbers for that directory inode were clearly bogus; they were far too
big to be legal block numbers, given the size of your partition. What
they did look like were ASCII characters decoded as integers. So I
looked more closely, and it looks like a directory block got written
over that directory's indirect block. Here's a fragment of what those
block numbers decoded themselves into:

I=400283, rec_len=28, name_len=20
3010731339.388678371
I=400284, rec_len=32, name_len=21
2197947723.2223214451
I=400285, rec_len=28, name_len=19
1181609475.37011443
I=400286, rec_len=32, name_len=21
3611287028.2167611668
I=400287, rec_len=32, name_len=21
1044609811.1164349659
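
To illustrate how directory data read through an indirect block turns into
"bogus" block numbers, here is a minimal Python sketch. It assumes the
little-endian on-disk layout of an ext2 directory entry from that era
(32-bit inode, 16-bit rec_len, 16-bit name_len, then the name). The helper
names pack_dirent, as_block_numbers and parse_dirents are hypothetical and
are not part of debugfs or e2fsprogs.

```python
import struct

# Assumed on-disk header of an old-style ext2 directory entry:
# inode (u32), rec_len (u16), name_len (u16), all little-endian.
DIRENT_HEADER = struct.Struct("<IHH")


def pack_dirent(inode, name):
    """Build one directory entry, padded to a multiple of 4 bytes."""
    rec_len = (DIRENT_HEADER.size + len(name) + 3) & ~3
    raw = DIRENT_HEADER.pack(inode, rec_len, len(name)) + name
    return raw.ljust(rec_len, b"\0")


def as_block_numbers(raw):
    """Read the bytes the way an indirect block is read: as u32 values."""
    count = len(raw) // 4
    return list(struct.unpack("<%dI" % count, raw[:count * 4]))


def parse_dirents(raw):
    """Read the same bytes as a chain of directory entries."""
    entries = []
    offset = 0
    while offset + DIRENT_HEADER.size <= len(raw):
        inode, rec_len, name_len = DIRENT_HEADER.unpack_from(raw, offset)
        if rec_len < DIRENT_HEADER.size:
            break  # a record this short cannot be valid; stop walking
        start = offset + DIRENT_HEADER.size
        name = raw[start:start + name_len].decode("ascii", "replace")
        entries.append((inode, rec_len, name_len, name))
        offset += rec_len
    return entries


block = (pack_dirent(400283, b"3010731339.388678371")
         + pack_dirent(400284, b"2197947723.2223214451"))
print(as_block_numbers(block)[:3])
print(parse_dirents(block))
```

Read as block numbers, the first three words come out as 400283, 1310748
and 808529971; the third is just the ASCII text "3010" reinterpreted as an
integer. Read as directory entries, the same bytes give back the first two
records of the fragment shown here.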

The $64,000 question, then, is how this happened. You said that you were getting
this error message often; was it always with this directory, and did you
run e2fsck to clean up the corrupted directory between instances of this
error showing up?

Explanations range from some kind of SMP locking bug, to a bug in the
SCSI driver, to an unclean SCSI bus with parity checking turned off
corrupting a block address, etc.

If the problem goes away when you reboot (e2fsck reports no
problems, and debugfs on the directory after the reboot shows no illegal
blocks associated with that inode), then it was a problem reading the
block into memory. It could indicate a problem with the buffer cache,
or again, possibly an error introduced by the hardware or the SCSI bus.

We've occasionally had errors like this reported, but it's always been
a real devil to track down exactly what caused it.

Thanks for clearing that up. I also found another problem we have been
having that you might be able to shed some light on. On our news server,
we have been getting the following error messages:

Mar 17 10:17:36 nnrp2 kernel: EXT2-fs error (device 09:00): ext2_add_entry:
bad entry in directory #458756: rec_len is too small for name_len -
offset=5232, inode=6324492, rec_len=16, name_len=4101
Mar 17 10:17:36 nnrp2 kernel: EXT2-fs error (device 09:00): ext2_find_entry:
bad entry in directory #458756: rec_len is too small for name_len -
offset=5232, inode=6324492, rec_len=16, name_len=4101
Mar 17 10:17:36 nnrp2 kernel: EXT2-fs error (device 09:00): ext2_add_entry:
bad entry in directory #458756: rec_len is too small for name_len -
offset=5232, inode=6324492, rec_len=16, name_len=4101

Hmm... this again could be another case of a disk block getting written
to the wrong place.
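
For reference, the "rec_len is too small for name_len" complaint can be
reproduced with a little arithmetic. The Python sketch below assumes the
usual ext2 rule that a directory record needs an 8-byte header plus the
name, rounded up to a multiple of 4 bytes. The function names are
hypothetical, and the kernel's actual check may differ in detail.

```python
DIRENT_HEADER_SIZE = 8  # assumed: inode (4) + rec_len (2) + name_len (2)


def min_rec_len(name_len):
    """Smallest record length that can hold a name of name_len bytes."""
    return (DIRENT_HEADER_SIZE + name_len + 3) & ~3


def rec_len_too_small(rec_len, name_len):
    """True when the record cannot possibly contain its own name."""
    return rec_len < min_rec_len(name_len)


# Values taken from the kernel messages quoted above.
print(min_rec_len(4101))
print(rec_len_too_small(16, 4101))
```

A name_len of 4101 would need a record of at least 4112 bytes, so a
rec_len of 16 fails the check. For comparison, the entries in the earlier
debugfs fragment (rec_len=28 for name_len=20, rec_len=32 for name_len=21)
satisfy the same rule exactly.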

Can you give us a full breakdown of the Linux kernel version number,
what Linux distribution (if any) was used, and what the hardware on the
two boxes is (SCSI adapters, hard disks, etc.)? That kind of data is going
to be absolutely necessary to try to track down what's going on here.

- Ted
