Re: Default cache_hot_time value back to 10ms

From: Ingo Molnar
Date: Wed Oct 06 2004 - 02:49:51 EST



* Chen, Kenneth W <kenneth.w.chen@xxxxxxxxx> wrote:

> Chen, Kenneth W wrote on Tuesday, October 05, 2004 10:31 AM
> > We have experimented with a similar thing, by bumping sd->cache_hot_time up
> > to a very large number, like 1 sec. What we measured was an equally low
> > throughput. But that was because of not enough load balancing.
>
> Since we are talking about load balancing, we decided to measure various
> values of the cache_hot_time variable to see how it affects app performance.
> We first established a baseline number with the vanilla base kernel (default
> at 2.5ms), then swept that variable up to 1000ms. All of the experiments were
> done with Ingo's patch posted earlier. Here are the results (the test
> environment is a 4-way SMP machine, 32 GB memory, 500 disks, running an
> industry-standard db transaction processing workload):
>
> cache_hot_time | workload throughput | idle
> ---------------+---------------------+-----
>          2.5ms |               100.0 |   0%
>            5ms |               106.0 |   0%
>           10ms |               112.5 |   1%
>           15ms |               111.6 |   3%
>           25ms |               111.1 |   5%
>          250ms |               105.6 |   7%
>         1000ms |               105.4 |   7%
>
> Clearly the default value for SMP has the worst application throughput (12%
> below peak performance). When set too low, the kernel is too aggressive on
> load balancing and we still see cache thrashing despite the perf fix.
> However, if set too high, the kernel gets too conservative and does not do
> enough load balancing.
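
For reference, the gap between the default and the peak can be recomputed
from the table above. The sketch below only uses the figures quoted in the
table; the function names are made up for illustration. Depending on the
denominator, the gap reads as roughly 11% below the peak or 12.5% above the
2.5ms baseline.

```python
# Throughput figures copied from the table above, keyed by cache_hot_time in
# msec. Throughput is relative to the 2.5ms baseline (= 100.0).
results = {2.5: 100.0, 5: 106.0, 10: 112.5, 15: 111.6,
           25: 111.1, 250: 105.6, 1000: 105.4}


def peak(results):
    """Return (cache_hot_time_ms, throughput) of the best measured setting."""
    return max(results.items(), key=lambda kv: kv[1])


def percent_below_peak(results, setting_ms):
    """How far a given setting falls short of the peak, in percent of peak."""
    best = peak(results)[1]
    return 100.0 * (best - results[setting_ms]) / best


def percent_above(results, setting_ms, reference_ms):
    """Gain of one setting over a reference setting, in percent of reference."""
    ref = results[reference_ms]
    return 100.0 * (results[setting_ms] - ref) / ref


if __name__ == "__main__":
    best_ms, best_tp = peak(results)
    print("peak at %sms: %.1f" % (best_ms, best_tp))
    print("2.5ms is %.1f%% below peak" % percent_below_peak(results, 2.5))
    print("peak is %.1f%% above 2.5ms" % percent_above(results, best_ms, 2.5))
```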

could you please try the test in 1 msec increments around 10 msec? It
would be very nice to find a good formula and the 5 msec steps are too
coarse. I think it would be nice to test 7,9,11,13 msecs first, and then
the remaining 1 msec slots around the new maximum. (assuming the
workload measurement is stable.)
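
A minimal sketch of that two-pass sweep, to make the sampling order concrete.
The helper names and the radius of the second pass are assumptions for
illustration only; the second pass cannot be planned until the first pass
has shown where the new maximum is.

```python
def coarse_points(center_ms, offsets=(-3, -1, 1, 3)):
    """First pass: 2 msec steps straddling the current best value."""
    return [center_ms + off for off in offsets]


def fine_points(best_ms, measured, radius=2):
    """Second pass: the 1 msec slots around the new maximum that have not
    been measured yet. 'radius' is a hypothetical choice, not from the
    thread."""
    candidates = range(best_ms - radius, best_ms + radius + 1)
    return [ms for ms in candidates if ms not in measured]


if __name__ == "__main__":
    first_pass = coarse_points(10)
    print("first pass:", first_pass)

    # Values measured once the first pass is done (10ms is already known).
    measured = set(first_pass) | {10}

    # The location of the new maximum is unknown until the runs complete;
    # each value below is just one possible outcome.
    for possible_best in (9, 10, 11):
        print("if max at %dms, test next:" % possible_best,
              fine_points(possible_best, measured))
```

Testing the odd values first keeps the number of runs small while still
bracketing the peak from both sides.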

> This value defaulted to 10ms before the domain scheduler; why does the
> domain scheduler need to change it to 2.5ms? And on what basis was that
> decision made? We are proposing to change that number back to 10ms.

agreed. What value does cache_decay_ticks have on your box?

>
> Signed-off-by: Ken Chen <kenneth.w.chen@xxxxxxxxx>

Signed-off-by: Ingo Molnar <mingo@xxxxxxx>

Ingo