Antw: Re: Lower bound 0.05 on 15-Minute load?

From: Ulrich Windl
Date: Fri Jul 03 2015 - 02:13:12 EST


>>> Martin Steigerwald <martin@xxxxxxxxxxxx> wrote on 02.07.2015 at 11:26 in
message <1479160.a5Vb4cJSSF@merkaba>:
> On Thursday 02 July 2015 10:50:13 Ulrich Windl wrote:
>> Hi!
>
> Hi Ulrich,
>
>> I'm not subscribed, so please CC: me for your replies.
>>
>> When graphing the CPU load, I noticed that the 15-minute average never
>> drops below 0.05, while the 5-minute load and the 1-minute load do
>> (Kernel 3.0.101-0.47.52-xen of SLES11 on x86_64).
>
> Load average is *NOT* the CPU load although this is a very common
> misconception.

I think the correlation of 1-min, 5-min and 15-min values is independent of the actual meaning of the value.

>
> Load average indicates the number of processes that are waiting to be
> scheduled / running (which is CPU saturation) *and* those that are waiting
> uninterruptibly. You can have a high load average without much CPU
> utilization, for example by running 20 find processes on a /home on NFS.
>
> A high load can be CPU-bound but it doesn't need to be.

I know that.

>
> So a high load can only indicate that things are running more slowly, but
> not why; the cause can be at least two things and does not need to be the
> CPU.

How is that related to my complaint/question?

>
> Also the load is normalized to CPU cores.

Actually I don't think so, but that's also not related to the issue I reported. I know that the HP-UX load was the average over all CPUs, while for Linux the load seems to be the sum over all CPUs, meaning a load of 4 is low for a 12-CPU machine. But that's all unrelated...
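
To make the sum-versus-average distinction concrete, here is a minimal Python sketch. It assumes the first three fields of /proc/loadavg are the 1-, 5- and 15-minute values, as in the sample output quoted in this message. The helper names parse_loadavg and per_cpu are made up for illustration, and dividing by the CPU count is only one possible way to compare machines of different sizes.

```python
def parse_loadavg(text):
    """Return the 1-, 5- and 15-minute values from a /proc/loadavg line."""
    fields = text.split()
    return tuple(float(x) for x in fields[:3])


def per_cpu(load, ncpus):
    """Hypothetical normalization: system-wide load divided by CPU count."""
    return load / ncpus


# Sample line taken from the output quoted in this message.
sample = "0.00 0.04 0.05 1/108 9704"
one, five, fifteen = parse_loadavg(sample)
print(one, five, fifteen)

# A system-wide load of 4 on a 12-CPU machine, expressed per CPU.
print(round(per_cpu(4.0, 12), 2))
```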

>
>> Is that a known bug? An interactive call of "uptime" seems to confirm my
>> suspicion:
>> windl> uptime
>> 10:41am up 23 days 18:49, 1 user, load average: 0.08, 0.05, 0.05
>> windl> uptime
>> 10:48am up 23 days 18:56, 1 user, load average: 0.00, 0.04, 0.05
>> windl> cat /proc/loadavg
>> 0.00 0.04 0.05 1/108 9704
>>
>> I'll attach a sample graph.
>
> Why should it be? As you can see in the graph you have higher spikes with the
> 1-minute average. As it's just an average over one minute, it more easily
> drops below 0.05. But the 5-minute and 15-minute averages need more time to
> drop lower, so for them to become lower, you need longer times without
> spikes in load average.
>
> So it's natural you get "flatter" curves with longer averaging windows.
> Averages easily hide things like spikes.

Actually it seems my "mathematical eye" is better than yours: I have another graph that shows the problem even more clearly (same kernel and hardware, just another machine).
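
One way such a floor could arise is rounding in a fixed-point moving average. The sketch below is Python, not kernel code. It assumes the constants from the Linux headers (FSHIFT = 11, EXP_1 = 1884, EXP_5 = 2014, EXP_15 = 2037), an update step that rounds to nearest, and the two-decimal formatting of /proc/loadavg with its FIXED_1/200 offset. Whether this exact kernel matches all of those assumptions is not verified here, so treat it as an illustration of a possible mechanism rather than a diagnosis. idle_floor and shown are names made up for this example.

```python
FSHIFT = 11
FIXED_1 = 1 << FSHIFT                   # 1.0 in fixed point (2048)
EXP_1, EXP_5, EXP_15 = 1884, 2014, 2037  # decay factors per 5-second tick


def calc_load(load, exp, active):
    """One update step, ASSUMED to round to nearest (see lead-in)."""
    load = load * exp + active * (FIXED_1 - exp)
    load += 1 << (FSHIFT - 1)           # add 0.5 before truncating
    return load >> FSHIFT


def idle_floor(exp, start=FIXED_1):
    """Decay from `start` with nothing runnable until the value stops changing."""
    load = start
    while True:
        nxt = calc_load(load, exp, 0)
        if nxt == load:
            return load
        load = nxt


def shown(load):
    """Format a fixed-point value with two decimals, /proc/loadavg style."""
    v = load + FIXED_1 // 200           # offset of about 0.005 before truncation
    frac = ((v & (FIXED_1 - 1)) * 100) >> FSHIFT
    return f"{v >> FSHIFT}.{frac:02d}"


for name, exp in (("1-min", EXP_1), ("5-min", EXP_5), ("15-min", EXP_15)):
    floor = idle_floor(exp)
    print(name, floor, shown(floor))
```

Under these assumptions the 15-minute value stops decaying at 93/2048 on an idle system, which is displayed as 0.05, while the 1-minute and 5-minute values stall lower (displayed as 0.00 and 0.01). The decay stalls because the per-step decrement, load * (FIXED_1 - exp) / FIXED_1, falls below one half and is rounded away.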

Regards,
Ulrich


Attachment: Load-15.png
Description: PNG image