Re: 2.2.8/2.3.0 disk I/O hang, appears to serialise following processes

Johannes Erdfelt (jerdfelt@sventech.com)
Sat, 15 May 1999 19:38:33 -0400


On Fri, May 14, 1999, David <david+nospam@killerlabs.com> wrote:
> Ok, I've seen another thread on this but haven't seen patches.
>
> This is how it happened to me. I was untarring X 3.3.3 source and
> reached 100% disk capacity so i hit ctrl-c in the untar rxvt which
> didn't get me anywhere. I went to another rxvt and started an rm of all
> I had just untarred. This didn't yield anything. Now with two terms
> hung, I went to a third (I couldn't start any more rxvts so I was now
> being very cautious about what I ran) and started capturing /proc info.
> Here's what I could grab. Sorry, by the end of this capture, all of my
> terminals had hung either on disk i/o or a some form of tty i/o (i
> assume).
>
> Again, this input is somewhat sparse and lacking because I couldn't read
> the disk or write to it.
>
> Some things to note; update was not running as per notes w/ bdflush.
> There is another problem with this system. I *must* disable APM, not
> just in the kernel, but in CMOS as well. This machine is an AMD
> K6-2/200 w/ 128megs, 3.5gig scsi/1.4g ide. It has a USB mouse on it
> which I know is also subject to a variety of bad bugs, indeed, switching
> out of X to console used to lockup the machine hard with no oops
> reports.
>
> To contrast this, I have the exact same kernel running on the machine
> next to it, running the same set of programs w/ libs etc. It doesn't
> have a USB mouse, *does* have APM compiled/enabled etc, etc. The
> differences being the scsi disk/usb mouse and I haven't run it out of
> space. It is also a pentium 166 overclocked to 200. It's been very
> very stable.

Scary. This also happened to me. I'm running 2.2.8 and I was untarring
the 2.3.2 sources and I ran out of disk space. Then the tar froze. I
then tried deleting another file off of the same parition and it hung.

kflushd, update and rm all froze in __wait_on_super, some zsh shells I
was trying to exit froze in __down_failed and the tar froze in
wakeup_bdflush.

There is definately a race condition somewhere in there.

update was running, and I assume from this post and others that it is
bad to have update running? It seems to make to difference looking at
this post.

I WAS able to read off of that parition at the time, just anything
trying to write would freeze. Atleast reading some man pages worked for
me :)

My machine is a K6-2/300 w/ 128megs of RAM, Adaptec 2940, partition was
on a SCSI disk. Although I've been hacking USB alot recently, it was NOT
loaded at the time. My machine also had APM compiled/enabled.

JE

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/