Re: High load average on disk I/O on 2.6.17-rc3
From: Arjan van de Ven
Date: Mon May 08 2006 - 08:37:32 EST
On Mon, 2006-05-08 at 12:28 +0100, Russell King wrote:
> On Mon, May 08, 2006 at 01:22:36PM +0200, Arjan van de Ven wrote:
> > On Mon, 2006-05-08 at 13:13 +0200, Erik Mouw wrote:
> > > On Sun, May 07, 2006 at 09:50:39AM -0700, Andrew Morton wrote:
> > > > This is probably because the number of pdflush threads slowly grows to its
> > > > maximum. This is bogus, and we seem to have broken it sometime in the past
> > > > few releases. I need to find a few quality hours to get in there and fix
> > > > it, but they're rare :(
> > > >
> > > > It's pretty harmless though. The "load average" thing just means that the
> > > > extra pdflush threads are twiddling thumbs waiting on some disk I/O -
> > > > they'll later exit and clean themselves up. They won't be consuming
> > > > significant resources.
> > >
> > > Not completely harmless. Some daemons (sendmail, exim) use the load
> > > average to decide if they will allow more work.
> >
> > and those need to be fixed most likely ;)
>
> Why do you think that?
I think that because, with the characteristics of modern hardware, "load"
is not a good estimator of whether the hardware can take on more
jobs.
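(For reference, the pattern those daemons follow is roughly the sketch
below: sample the 1-minute load average from /proc/loadavg and back off
above a threshold, in the spirit of sendmail's RefuseLA or exim's
queue_only_load. The threshold and names here are made up for
illustration; this is not either daemon's actual code.)

/* load_gate.c -- hypothetical sketch of load-based admission control */
#include <stdio.h>

static double one_minute_load(void)
{
	double load = 0.0;
	FILE *f = fopen("/proc/loadavg", "r");

	if (f) {
		/* the first field of /proc/loadavg is the 1-minute average */
		if (fscanf(f, "%lf", &load) != 1)
			load = 0.0;
		fclose(f);
	}
	return load;
}

int main(void)
{
	const double refuse_above = 12.0;	/* illustrative threshold */

	if (one_minute_load() > refuse_above)
		printf("load too high, deferring new connections\n");
	else
		printf("accepting work\n");
	return 0;
}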
To explain why I think this, I first need to draw an ASCII art graph:
100
  % |                 ************************|
  b |             ****
  u |          ***
  s |        **
  y |      *
    |    *
    |  *
    |*
    +------------------------------------------
                       -> workload
On the Y axis is the percentage of the hardware in use; on the horizontal
axis is the amount of work that gets done (in the mail case, say emails
per second).
Modern hardware has an initial ramp-up that is near linear in terms of
workload versus utilization, but then a saturation area is reached at
100%: even though the system is 100% busy, more work can still be added,
up to the point I marked with a "|". This comes from the increased
batching you get at higher utilization compared to lower utilization.
CPU caches, memory burst speeds versus latency, and disk streaming
performance versus random seeks all create and widen this saturation
region to the right. And all of those effects have grown in hardware over
the last 4+ years, with the result that the saturation "reach" has
extended far to the right as well.
How does this tie into "load" and what exim/sendmail use it for? Well...
today "load" is a somewhat poor approximation of this
percentage-in-use[1], but as per the graph and argument above, even if it
were a perfect representation of it, it still would not be a good measure
of whether a system can do more work (per unit of time) or not.
[1] I didn't discuss the utilization of *what*; in reality that is a
combination of CPU, memory, disk and possibly network resources. Load
tries to combine CPU and disk into one number by simply adding the
runnable and the disk-waiting tasks; that is obviously a rough estimate,
and I'm not arguing that it isn't rough.
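(For the curious, the sketch below mirrors roughly how the 2.6 kernel
arrives at those numbers: every 5 seconds the count of runnable plus
uninterruptible (disk-wait) tasks is fed into three fixed-point,
exponentially damped moving averages. The constants match
include/linux/sched.h, but treat this as an approximation for
illustration, not the actual kernel code.)

/* loadavg_sketch.c -- user-space approximation of the kernel's loadavg */
#include <stdio.h>

#define FSHIFT   11			/* bits of fixed-point precision */
#define FIXED_1  (1 << FSHIFT)		/* 1.0 in fixed point */
#define EXP_1    1884			/* decay factor for the 1-min average */
#define EXP_5    2014			/* 5-min average */
#define EXP_15   2037			/* 15-min average */

static unsigned long calc_load(unsigned long load, unsigned long exp,
			       unsigned long active)
{
	/* load = load*exp + active*(1-exp), all in fixed point */
	load *= exp;
	load += active * (FIXED_1 - exp);
	return load >> FSHIFT;
}

int main(void)
{
	unsigned long avg1 = 0, avg5 = 0, avg15 = 0;
	int tick;

	/* pretend 150 tasks sit in disk wait for ten minutes: the load
	 * average climbs toward 150 even if the CPU is nearly idle */
	for (tick = 0; tick < 120; tick++) {		/* 120 * 5s = 10 min */
		unsigned long active = 150 * FIXED_1;	/* runnable + D-state */

		avg1  = calc_load(avg1,  EXP_1,  active);
		avg5  = calc_load(avg5,  EXP_5,  active);
		avg15 = calc_load(avg15, EXP_15, active);
	}
	printf("loadavg: %lu.%02lu %lu.%02lu %lu.%02lu\n",
	       avg1 >> FSHIFT,  (avg1  & (FIXED_1 - 1)) * 100 / FIXED_1,
	       avg5 >> FSHIFT,  (avg5  & (FIXED_1 - 1)) * 100 / FIXED_1,
	       avg15 >> FSHIFT, (avg15 & (FIXED_1 - 1)) * 100 / FIXED_1);
	return 0;
}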
> Having a single CPU machine with a load average of 150 and still feel
> very interactive at the shell is extremely counter-intuitive.
Well, it's also a sign that the CPU scheduler is prioritizing your shell
over the background "menial" work ;)