Re: 2.4.20: Process stuck in __lock_page ...

From: Marc-Christian Petersen (m.c.p@wolk-project.de)
Date: Thu May 29 2003 - 09:49:13 EST


On Thursday 29 May 2003 15:55, Willy Tarreau wrote:

Hi Willy,

> I've done a few tests with -rc6 on my dev machine (dual xp 1.5G, 512 MB,
> scsi). It's the *FIRST* time I have ever seen my mouse cursor hang (just a
> little bit however, and totally acceptable) ! Usually, my kernels include -aa
> VM and lowlat patches, and I've never encountered this behaviour on this
> machine with such a configuration. However, with stock kernel, I admit that
> during the 2 minutes it takes to write the 2G file, I see the mouse stick
> two or three times during about 1 second, which is quite acceptable IMHO.
WRONG. A mouse stick is not acceptable in _any_ way. Other OSes handle this
pretty well, and if Linux has problems with mouse sticks, this has to be
fixed! Either in kernel space or in userspace (XFree86).

> Opening an xterm may take 10s to get to the prompt (more annoying). Same to
> launch 'ps'.
ACK!

> I retried with rc4aa1, and everything went very smoothly again; it takes at
> most 1 second to get an xterm with the prompt ready, and ps responds
> immediately. So I think that there are two things here:
> - those who experience very long hangs may use a heavy window manager
> which does continuous disk accesses (I mean it accesses the disk for
> any simple operation).
> - a hungry WM may also be swapped during such operations, rendering it
> totally unusable, particularly if the swap is on the same physical disk
> as the file being written to.
Well, sorry, but: no!

The pauses/stops occur no matter which window manager is used (KDE2/3,
WindowMaker, fvwm, GNOME etc. foobar). The reason you are not seeing such
things with -aa is Andrea's lowlatency elevator, his lowlatency fixes and some
important fixes which are not in the stock kernel yet.

I reproduced mouse sticks, and the keyboard not accepting any input, for
$seconds with _every_ kernel which is based on 2.4.19/2.4.20/2.4.21*. This
also includes -aa (well, not as braindead bad as mainline was before the
fix), but it is only less bad there because of the lowlat elevator from Andrea.
And as I said yesterday (or 2 days ago? dunno), the lowlat elevator drops
throughput (Andrea, it _does_ ;).

It's not only mouse hangs (as I've reported tons of times); the keyboard also
does not accept any input (the delay varies between 1 and 15 seconds), and
this also applies if you don't run X at all.

Another fine example is:

- Start a screen session, not running X at all.
- Thrash your HD with tons of writes.
- Press Ctrl-A c to open a new screen window.

You will see that it takes as long as what you described above for starting
up an xterm or calling ps. It does _not_ happen with 2.4.18!
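
For anyone who wants numbers instead of "it feels slow", the steps above can
be turned into a rough measurement. This is only a sketch, not a benchmark
anyone in this thread used: the function names (write_load, time_command,
measure_under_load), the chunk size and the sampling interval are all made up
for illustration. It writes a large file in a background thread and
repeatedly times how long a small command needs to start and exit.

```python
import os
import subprocess
import tempfile
import threading
import time


def write_load(path, total_bytes, chunk_bytes=1 << 20):
    """Write total_bytes of zeros to path through the page cache."""
    chunk = b"\0" * chunk_bytes
    written = 0
    with open(path, "wb") as f:
        while written < total_bytes:
            n = min(chunk_bytes, total_bytes - written)
            f.write(chunk[:n])
            written += n
    return written


def time_command(argv):
    """Return the wall-clock seconds needed to start argv and wait for it."""
    start = time.monotonic()
    subprocess.run(argv, stdout=subprocess.DEVNULL,
                   stderr=subprocess.DEVNULL, check=False)
    return time.monotonic() - start


def measure_under_load(directory, total_bytes, argv, interval=0.5):
    """Time argv repeatedly while a scratch file is written in directory.

    The scratch file (hypothetical helper, removed afterwards) should live
    on the disk under test. At least one sample is always taken.
    """
    fd, path = tempfile.mkstemp(dir=directory)
    os.close(fd)
    writer = threading.Thread(target=write_load, args=(path, total_bytes))
    samples = []
    writer.start()
    try:
        while True:
            samples.append(time_command(argv))
            if not writer.is_alive():
                break
            time.sleep(interval)
    finally:
        writer.join()
        os.remove(path)
    return samples
```

Calling measure_under_load("/some/dir/on/the/test/disk", 2 * 1024 ** 3,
["ps"]) and looking at max() of the returned list roughly matches the 2G
write plus 'ps' test quoted above; the directory is a placeholder.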

> So, could the people who report long hangs retry with swap disabled ?
It's somewhat better but not acceptable.

> Can we limit the amount of memory consumed by the cache during such a
> write ?
I have been asking for such a feature for years ;)
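
Until something like that exists in the kernel, a writer can at least bound
its own cache footprint from userspace. The following is a sketch under
loud assumptions: it relies on posix_fadvise(), which as far as I know is not
available on the 2.4 kernels discussed here, and the function name
write_with_bounded_cache and the flush_every parameter are invented for this
example. It flushes every few chunks and then tells the kernel that the
already written range is no longer needed.

```python
import os


def write_with_bounded_cache(path, total_bytes, chunk_bytes=1 << 20,
                             flush_every=16):
    """Write zeros to path, flushing and dropping cached pages periodically.

    Every flush_every chunks the data is forced to disk with fsync() and the
    flushed range is marked POSIX_FADV_DONTNEED, so at most roughly
    flush_every * chunk_bytes of this file should sit dirty in the cache.
    """
    chunk = b"\0" * chunk_bytes
    written = 0          # bytes handed to the kernel so far
    flushed = 0          # bytes already synced and dropped from the cache
    pending_chunks = 0
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        while written < total_bytes:
            n = min(chunk_bytes, total_bytes - written)
            view = memoryview(chunk)[:n]
            while view:  # os.write() may write fewer bytes than requested
                k = os.write(fd, view)
                view = view[k:]
            written += n
            pending_chunks += 1
            if pending_chunks >= flush_every or written == total_bytes:
                os.fsync(fd)
                # Assumption: only present on platforms that offer it.
                if hasattr(os, "posix_fadvise"):
                    os.posix_fadvise(fd, flushed, written - flushed,
                                     os.POSIX_FADV_DONTNEED)
                flushed = written
                pending_chunks = 0
    finally:
        os.close(fd)
    return written
```

This trades throughput for a smaller cache footprint, and it only helps for
writers you control; it does not replace a limit enforced by the VM.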

Well, my summary: the bug has been there for over 15 months (I won't mention
again that I reported the bug 15 months ago ;-) ... It _may_ take some very
obscure hardware problem to reproduce this bug, but as this thread shows,
there are tons of people who can reproduce it on different hardware, starting
with 2.4.19-pre1.

ciao, Marc
