Re: [PATCH] x86: Reduce the default HZ value

From: Chris Snook
Date: Thu May 07 2009 - 12:36:28 EST


On Tue, May 5, 2009 at 5:57 PM, Alok Kataria <akataria@xxxxxxxxxx> wrote:
>
> On Tue, 2009-05-05 at 14:21 -0700, H. Peter Anvin wrote:
>> Alok Kataria wrote:
>> > Hi,
>> >
>> > Given that no major objections came up regarding reducing the HZ
>> > value in http://lkml.org/lkml/2009/4/27/499, below is the patch
>> > which actually reduces it. Please consider it for tip.
>> >
>>
>> What is the benefit of this?
>
> I did some experiments on Linux 2.6.29 guests running on VMware and
> noticed that the number of timer interrupts can reduce the total
> throughput of the system.
> A simple tight loop experiment showed that with HZ=1000 we took about
> 264sec to complete the loop and that same loop took about 255sec with
> HZ=100.
> You can find more information here http://lkml.org/lkml/2009/4/28/401

This is why certain niches, such as HPC users, often prefer HZ=100
kernels. For the rest of us, sacrificing a few percent of CPU
throughput for significantly lower latency is well worth it.
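
To put "a few percent" in numbers, here is a minimal sketch using only
the loop timings quoted above (264 sec at HZ=1000, 255 sec at HZ=100).
The function name is made up for illustration, and this says nothing
about workloads other than that one tight loop.

```c
/*
 * Relative slowdown of the slower run versus the faster one,
 * expressed as a percentage of the faster run's time.
 * slowdown_percent() is a hypothetical helper, not a kernel API.
 */
static double slowdown_percent(double slow_sec, double fast_sec)
{
    return (slow_sec - fast_sec) / fast_sec * 100.0;
}
```

Calling slowdown_percent(264.0, 255.0) gives roughly 3.5, i.e. the
HZ=1000 run took about 3.5% longer than the HZ=100 run.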

> And with HRT I don't see any downsides in terms of increased latencies
> for device timers or anything of that sort.
>
>>
>> I can see at least one immediate downside: some timeout values in the
>> kernel are still maintained in units of HZ (like poll, I believe), and
>> so with a lower HZ value we'll have higher roundoff errors.
>
> If that is really such a big problem, shouldn't we think about moving
> to schedule_hrtimeout for such cases rather than relying on
> jiffy-based timeouts?
> The hrtimer explanation at http://www.tglx.de/hrtimers.html
> also discusses where these HZ-based (timer wheel) timeouts should be
> used, and notes that they shouldn't really depend on accurate timing.

But your patch doesn't do this. If you want us to merge a patch that
makes VMware systems faster, we're a lot more likely to take it if it
makes everyone else's systems faster, or at least not slower.
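
For a sense of scale on the roundoff concern quoted above, here is a
rough sketch of what happens when a millisecond timeout is rounded up
to whole ticks. The helpers are hypothetical and deliberately
simplified; they are not the kernel's msecs_to_jiffies(), and they
assume a tick-based timeout can only expire on a tick boundary.

```c
/*
 * Hypothetical helper: round a millisecond timeout up to whole ticks
 * at a given HZ. Simplified for illustration; not the kernel's
 * msecs_to_jiffies().
 */
static unsigned long ms_to_ticks(unsigned long ms, unsigned long hz)
{
    return (ms * hz + 999) / 1000;
}

/* Hypothetical helper: length of a whole number of ticks, in ms. */
static unsigned long ticks_to_ms(unsigned long ticks, unsigned long hz)
{
    return ticks * 1000 / hz;
}
```

Under these assumptions a 15 ms timeout stays 15 ms at HZ=1000
(15 ticks), becomes 16 ms at HZ=250 (4 ticks), and becomes 20 ms at
HZ=100 (2 ticks).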

> Also the default HZ value was 250 before this commit
>
> commit 5cb04df8d3f03e37a19f2502591a84156be71772
>  x86: defconfig updates
>
> And it was 250 for a very long time before that too. The commit log
> doesn't explain why the value was bumped up either.

250 was considered a compromise between 100 and 1000, but almost
everyone who cared just ended up using one or the other, and most of
them preferred 1000.

Given your use case, what you really need to do is get Red Hat,
Novell, et al. on the phone and ask them to ship kernels with HZ=100,
because the distributions do their own thing anyway. If you can
figure out a way to do that without harming latency, they'll be
thrilled.

-- Chris