Re: linux-kernel-digest V1 #91

cloister bell (cloister@chad.hhhh.org)
Sat, 24 Jun 1995 15:23:18 -0700 (PDT)


> >> Hi. I have been running 1.2.10 since it was released, and had no
> >> problems until yesterday. I was in minicom on one VC, and had several
> >> shells opened on other VC's. The machine was running for about 18 hours.
> >> When I flipped out of minicom, and tried to type something, I got no
> >> response from any of the consoles. I flipped back to minicom, and could
> >> still type there.
> >
> >> I was forced to cold boot, and when the system came back, there were all
> >> sorts of filesystem errors, but they weren't repairable, unlike what
> >> usually happens. /bin/bash turned into an X man page, and /etc/groups
> >> had disappeared. bash isn't my normal login shell, and I don't even
> >> think it was running. Also /usr/spool/uucp turned into a 100k file,
> >> after e2fsck did its best to repair what had been screwed up.
> >
> >> I just thought that someone might have some insight as to what may have
> >> caused this.
>
> >it sounds a little bit similar to what happened to my machine a few weeks
> >ago with 1.2.0. in my case, the theory goes, something overwrote the
> >buffer cache with junk. when the cache got flushed, serious filesystem
> >damage happened. in my case the disk's superblock, root directory inode,
> >and a few directories down into the filesystem got overwritten, which
> >made the disk totally unbootable. it sounds like something similar
> >happened to you, although at the time you had less critical portions of
> >the disk in the buffer cache.
>
> >of course, if that's happening, it is an extremely difficult thing to
> >debug. since the typical result of this is your system becomes unusable,
> >you have no idea why, so you reboot (sync'ing like you're supposed to,
> >which may make the problem worse), and destroy any evidence about what
> >caused the problem.
>
> OK, I had a problem *very* similar to the above mentioned (root
> fs getting trashed, etc) while using minicom (see first quote).
> I was using 1.2.9 at the time. Is this a coincidence or is there
> something about minicom that could be causing this?
>
> Travis

i wasn't using minicom and never have, so that can't be the whole of the
problem. i still think that it's most likely to be a bug in some of the
disk buffering/caching code.