Re: howto combat highly pathologic latencies on a server?
From: David Rees
Date: Wed Mar 10 2010 - 18:45:25 EST
On Wed, Mar 10, 2010 at 9:17 AM, Hans-Peter Jansen <hpj@xxxxxxxxx> wrote:
> While this system usually operates fine, it suffers from delays, that are
> displayed in latencytop as: "Writing page to disk: 8425,5 ms":
> ftp://urpla.net/lat-8.4sec.png, but we see them also in the 1.7-4.8 sec
> range: ftp://urpla.net/lat-1.7sec.png, ftp://urpla.net/lat-2.9sec.png,
> ftp://urpla.net/lat-4.6sec.png and ftp://urpla.net/lat-4.8sec.png.
>
> From other observations, this issue "feels" like it is induced by single
> synchronisation points in the block layer, e.g. if I create heavy IO load on
> one RAID array, say resizing a VMware disk image, it can take up to a
> minute to log in by ssh, although the ssh login does not touch this area at
> all (different RAID arrays). Note that the latencytop snapshots above were
> taken during normal operation, not under this kind of load.
>
> Might later kernels mitigate this problem? As this is a production system,
> that is used 6.5 days a week, I cannot do dangerous experiments, also
> switching to 64 bit is a problem due to the legacy stuff described above...
> OTOH, my users suffer from this, and anything helping in this respect is
> highly appreciated.

It seems like a 2.6.32-based kernel, which brings per-BDI writeback and
the "CFQ low latency mode" changes, might help a good deal. On one of my
bigger machines (similar in specs to yours), which runs a lot of
processes doing a decent amount of IO, both latency and load average
went down after going from a 2.6.31 kernel to a 2.6.32 kernel (Fedora
11 system).

Like Chris suggested, I've also heard that the noop IO scheduler can
work well with Areca controllers on some kernels and workloads. It's
worth a shot, and you can even change it at run-time.
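
For the run-time switch, here is a minimal sketch, not a tested tool.
It relies on the sysfs file /sys/block/<dev>/queue/scheduler, which
lists the available schedulers with the active one in square brackets,
and which accepts a scheduler name written to it. The helper names and
the device name "sda" are my own placeholders, so substitute the block
device that sits behind your controller.

```python
from pathlib import Path


def parse_schedulers(text):
    """Parse a line such as 'noop anticipatory deadline [cfq]'.

    Returns (active, available); active is None if nothing is bracketed.
    """
    active = None
    available = []
    for name in text.split():
        if name.startswith("[") and name.endswith("]"):
            name = name[1:-1]
            active = name
        available.append(name)
    return active, available


def scheduler_path(device, sysfs_root="/sys"):
    # sysfs_root is a parameter only so the helpers can be tried
    # against a scratch directory instead of the live system.
    return Path(sysfs_root) / "block" / device / "queue" / "scheduler"


def get_scheduler(device, sysfs_root="/sys"):
    """Return (active, available) for one block device."""
    return parse_schedulers(scheduler_path(device, sysfs_root).read_text())


def set_scheduler(device, name, sysfs_root="/sys"):
    """Switch the IO scheduler of one block device (needs root on /sys)."""
    _, available = get_scheduler(device, sysfs_root)
    if name not in available:
        raise ValueError(f"{name!r} not offered for {device}: {available}")
    scheduler_path(device, sysfs_root).write_text(name)
```

As root, something like set_scheduler("sda", "noop") followed by
get_scheduler("sda") should show the new scheduler as active. A change
made this way does not survive a reboot, so it is easy to back out if
it makes things worse.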
-Dave