Re: The ext3 way of journalling
From: Theodore Tso
Date: Sun Jan 13 2008 - 18:12:08 EST
On Mon, Jan 14, 2008 at 12:23:10AM +0200, Tuomo Valkonen wrote:
> On 2008-01-14 00:13 +0200, Tuomo Valkonen wrote:
> > Also, I must say that e2fsck is brain-damaged, if it can be confused
> > by/do the stupid then when the system clock has warped by just a few
> > hours, not the _days_ that a file system check interval typically is,
> > and users need to specifically kludge around such misbehaviour in
> > e2fsck.
>
> Just to clarify, I had about 60 days of uptime, and hence at least
> 60 days since the last FS check/mount/etc., when Linux crashed those
> few days ago, and wanted to start checking disks with "9192 days since
> last file system check".
Well, let's see. 9192 days is a little over 25 years, so that means
the filesystem was marked as having done an fsck in 2008-25 or roughly
1983. If you're not seeing any other corruption when e2fsck runs,
it's highly unlikely that the superblock is getting corrupted. It's
much more likely that this early in your boot cycle, your clock is
sometimes incorrect.
My suggestion to you is to rig your init scripts to print out the the
current time/date using "/bin/date" and to print out the superblock
information using "dumpe2fs -h /dev/hdXX" and record the information
someplace useful. A simple way to do this would be via the following
command inserted into /etc/init.d/checkroot.sh:
(date; /sbin/dumpe2fs -h /dev/XXX) | logsave -a /var/log/boot-debug -
where you've replaced /dev/XXX with the block device of the filesystem
which keeps on getting checked erroneously.
All I can say is that most people aren't see what you're seeing, so
there is something unique about your system which is causing this
problem to show up. 9192 days means it's not the time going backwards
scenario; somehow the last checked value is getting set to some very
bogus value. Normally the only way this could happen is for the time
to be set to a bogus value (i.e., 1982) when the filesystem check
takes place. Is the "9192" number roughly constant, or is it always
changing?
I wonder if the battery-backed hardware clock in your system is
busted, and so you're always starting the system with some completely
bogus time. If your machine is on the network, then the "ntpdate"
program could be setting your time so that it looks correct, but
that's after e2fsck is run. If you really, really, can't guarantee
that the time on your system is correct in early boot, about the only
thing you really *can* do is to use the command "tune2fs -i 0
/dev/XXX" and disable time-based checks altogether.
Regards and best of luck,
- Ted
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/