Re: Inconsistent load average on tickless kernels

From: Aman Gupta
Date: Mon Mar 05 2012 - 17:45:50 EST


On Mon, Mar 5, 2012 at 11:57 AM, Lesław Kopeć
<leslaw.kopec@xxxxxxxxxxxxxx> wrote:
> On 29.02.2012 13:06, Peter Zijlstra wrote:
>
>> Missing here is a kernel build with CONFIG_NO_HZ but booted with
>> nohz=off; this would be an interesting data point because it includes
>> all the funny code but still ticks are the right frequency.
>
> You've asked for it and you got it. I have rebooted some servers with
> the nohz=off parameter set on kernels compiled with CONFIG_NO_HZ=y. They're
> the ones listed below with the 'off' suffix.
>
> On 29.02.2012 17:24, Peter Zijlstra wrote:
>
>> Hrmm, this suggests we age too hard with nohz code.. in your test case
>> is there significant idle time? That is, suppose you run each cpu at 30%
>> what is the period of your load? Running 3s out of 10s is significantly
>> different from running .3ms out of 1ms.
>
> It's definitely more similar to the second case - very frequent, but
> short bursts of activity. A single process does a tiny bit of
> computation mixed with a fair amount of network activity on each
> request. There are 80 such processes, which are responsible for the
> majority of the system load.
>
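
For readers following the "age too hard" remark, here is a minimal sketch
of the fixed-point exponential decay behind the 1-minute load average. It
is only an illustration: the constants follow the kernel's FSHIFT, FIXED_1
and EXP_1 definitions, but the sketch_* function names are hypothetical and
the code is not taken from any tree discussed in this thread. It assumes
one sample per LOAD_FREQ interval.

```c
#include <assert.h>

#define FSHIFT   11                 /* bits of fixed-point precision */
#define FIXED_1  (1UL << FSHIFT)    /* 1.0 in fixed point */
#define EXP_1    1884UL             /* 1/exp(5s/1min) in fixed point */

/* One sample step: load = load * e + active * (1 - e), in fixed point.
 * 'active' must already be scaled by FIXED_1. (Hypothetical name.) */
static unsigned long sketch_calc_load(unsigned long load, unsigned long exp,
                                      unsigned long active)
{
    assert(exp <= FIXED_1);
    load *= exp;
    load += active * (FIXED_1 - exp);
    return load >> FSHIFT;
}

/* Fold in n sample periods with nothing runnable. (Hypothetical name.)
 * If idle periods are over-counted, the average is aged this way more
 * often than it should be and ends up too low. */
static unsigned long sketch_decay_idle(unsigned long load, unsigned long exp,
                                       unsigned int n)
{
    while (n--)
        load = sketch_calc_load(load, exp, 0);
    return load;
}
```

Starting from a load of 1.0 (FIXED_1), every idle sample multiplies the
average by roughly 0.92, so a handful of spurious idle samples is enough to
pull the reported value down noticeably.
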
> On 29.02.2012 18:03, Peter Zijlstra wrote:
>
>>> The only thing I could find is that on nohz we can confuse the per-rq
>>> sample period, does the below make a difference?
>>
>> Uhm, something like so that is..
>>
>> ---
>>  kernel/sched/core.c |    3 ++-
>>  1 files changed, 2 insertions(+), 1 deletions(-)
>>
>> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
>> index d7c4322..44f61df 100644
>> --- a/kernel/sched/core.c
>> +++ b/kernel/sched/core.c
>> @@ -2380,7 +2380,8 @@ static void calc_load_account_active(struct rq *this_rq)
>>       if (delta)
>>               atomic_long_add(delta, &calc_load_tasks);
>>
>> -     this_rq->calc_load_update += LOAD_FREQ;
>> +     while (!time_before(jiffies, this_rq->calc_load_update))
>> +             this_rq->calc_load_update += LOAD_FREQ;
>>  }
>>
>>  /*
>>
>
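
The effect of the hunk above can be seen in isolation with a small
user-space sketch. This is an illustration under stated assumptions, not
kernel code: the sketch_* names are hypothetical, jiffies is modelled as a
plain unsigned long, and the period used in the example is LOAD_FREQ for
HZ=1000 (5*HZ + 1 = 5001 ticks).

```c
#include <assert.h>

/* Wraparound-safe "a is before b", the same idea as the kernel's
 * time_before(). (Hypothetical stand-in.) */
static int sketch_time_before(unsigned long a, unsigned long b)
{
    return (long)(a - b) < 0;
}

/* Old behaviour: move the per-rq deadline forward by exactly one period,
 * no matter how long the CPU was asleep. */
static unsigned long sketch_advance_once(unsigned long next_update,
                                         unsigned long period)
{
    return next_update + period;
}

/* Patched behaviour: keep stepping until the deadline is in the future. */
static unsigned long sketch_advance_catch_up(unsigned long now,
                                             unsigned long next_update,
                                             unsigned long period)
{
    assert(period > 0);     /* a zero period would never terminate */
    while (!sketch_time_before(now, next_update))
        next_update += period;
    return next_update;
}
```

With now = 20000 and a stale deadline of 5001, the single-step version
returns 10002, which is still in the past, so the runqueue would fold its
delta again on the very next tick. The loop version returns 20004, the
first period boundary after now.
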
> I have compiled another batch of kernels with this patch applied
> (they're the ones with 'patch0' suffix). The only difference was the
> patch had to go to kernel/sched.c, but that's what you get when not
> using the latest sources. Anyway, here are the results accompanied by a
> pretty picture [1]:
>
>                                    std     off     patch0
> 2.6.32.55-no-hz                    0.76    0.91    -
> 2.6.32.55-no-hz-74f5187ac8         6.41    9.40    4.93
> 2.6.32.55-no-hz-0f004f5a69         0.78    0.92    0.90
> 2.6.37-rc5-no-hz-0f004f5a69        0.95    0.92    0.86
> 2.6.37-rc5-no-hz-pre-0f004f5a69    9.16    10.47   8.02
>
> It seems that the patch didn't help much on kernels with 0f004f5a69
> applied. The ones with just 74f5187ac8 are reporting more plausible
> values, but slightly lower than the ones without patch0. Am I right to
> assume that the correct load values are the ones produced by kernels
> compiled with CONFIG_NO_HZ=n? Should they be the baseline?
>
> I can run additional tests if you have other leads to follow. Is there a
> particular kernel version I should focus on? If not I will continue
> to use the current bundle. I'm also planning to give the latest stable
> release a spin.

I can confirm these results on 3.2.8. Booting with nohz=off makes no
difference. Applying the patch to kernel/sched.c made no noticeable
difference either.

Aman

>
>
> [1] http://img835.imageshack.us/img835/2204/kernelload.png
>
> --
> Lesław Kopeć
>