Re: Monitoring filesystems / blockdevice for errors

From: Jan-Benedict Glaw (jbglaw@lug-owl.de)
Date: Mon Dec 18 2000 - 03:46:32 EST


On Sun, Dec 17, 2000 at 11:28:46PM -0600, Peter Samuelson wrote:
> [Mark Hahn]
> > > reinventing /proc/kmsg and klogd would be tre gross.
>
> [Lars Marowsky-Bree]
> > Well, only one process can read kmsg and get notified about new
> > messages at any time, so that makes the monitoring depend on
> > klogd/syslogd working, which given a write error by syslog might not
> > be the case...
>
> So rewrite klogd to do something much simpler for serious errors (yes
> they will be tagged as such) before trying to pass them on to syslogd.
> Or does it already do this? It's a userspace problem.

Hmmm... Even if LMB and I are often of quite different oppinions, I
think only modifying klogd is not enough. LMB stated that a userspace
tool would need to know any possibly error messages that could
possibly generated. Cleaning up all messages would be the first
step to prepare for failure reports to userspace. Ie, what errors do
re have?

        - Sense errors (recoverable) \
        - " " (unrecoverable) > for all kinds of devices
        - Complete device failure (HDD is gone) /
        - Data failure (wrong ext2 bitmaps) for all FS
        - RAM's ECC/parity errors
        - possibly some more;)

Cleaning up all error messages (maybe using exctly two lines: one for kind
of failure, one for device/RAM/fs specific messages) could help a lot
and doesn't hurt badly (code doesn't get really slower as these paths
are more-or-less never taken; but there is a little bit more bloat...).

With such an infrastructure, klogd could pass those lines to an external
helper (and additionally to syslog).

MfG, JBG

-- 
Fehler eingestehen, Größe zeigen: Nehmt die Rechtschreibreform zurück!!!
/* Jan-Benedict Glaw <jbglaw@lug-owl.de> -- +49-177-5601720 */
keyID=0x8399E1BB fingerprint=250D 3BCF 7127 0D8C A444 A961 1DBD 5E75 8399 E1BB
     "insmod vi.o and there we go..." (Alexander Viro on linux-kernel)


- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Sat Dec 23 2000 - 21:00:20 EST