RE: Default cache_hot_time value back to 10ms

From: Chen, Kenneth W
Date: Wed Oct 06 2004 - 12:25:32 EST


> Chen, Kenneth W wrote on Tuesday, October 05, 2004 10:31 AM
> > We have experimented with a similar thing, by bumping up sd->cache_hot_time to
> > a very large number, like 1 sec. What we measured was an equally low throughput,
> > but that was because of not enough load balancing.
>
> Since we are talking about load balancing, we decided to measure various
> values of the cache_hot_time variable to see how it affects app performance. We
> first established a baseline number with the vanilla base kernel (default at 2.5ms),
> then swept that variable up to 1000ms. All of the experiments were done with
> Ingo's patch posted earlier. Here are the results (test environment is a 4-way
> SMP machine, 32 GB memory, 500 disks, running an industry-standard db transaction
> processing workload):
>
> cache_hot_time | workload throughput
> --------------------------------------
>    2.5 ms  -  100.0  (0% idle)
>      5 ms  -  106.0  (0% idle)
>     10 ms  -  112.5  (1% idle)
>     15 ms  -  111.6  (3% idle)
>     25 ms  -  111.1  (5% idle)
>    250 ms  -  105.6  (7% idle)
>   1000 ms  -  105.4  (7% idle)
>
> Clearly the default value for SMP has the worst application throughput (12%
> below peak performance). When it is set too low, the kernel is too aggressive
> about load balancing and we still see cache thrashing despite the perf fix.
> However, if it is set too high, the kernel gets too conservative and does not
> do enough load balancing.
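
For anyone not following the whole thread: the knob gates task migration in
the load balancer. Roughly, a task that last ran within the previous
cache_hot_time nanoseconds is considered cache-hot and the balancer leaves it
alone. A simplified sketch of that check follows (field and function names are
approximate, not a verbatim copy of kernel/sched.c):

static inline int task_hot(struct task_struct *p, unsigned long long now,
			   struct sched_domain *sd)
{
	/* ran within the last cache_hot_time ns => still cache-hot */
	return (long long)(now - p->timestamp) < (long long)sd->cache_hot_time;
}

/* called by the load balancer for each candidate task */
static int can_migrate_task(struct task_struct *p, struct runqueue *rq,
			    int this_cpu, struct sched_domain *sd)
{
	if (task_running(rq, p))
		return 0;		/* currently on a CPU */
	if (!cpu_isset(this_cpu, p->cpus_allowed))
		return 0;		/* not allowed on the target CPU */
	if (task_hot(p, rq->timestamp_last_tick, sd))
		return 0;		/* still cache-hot, do not migrate */
	return 1;
}

Setting cache_hot_time low makes task_hot() almost never true (aggressive
balancing, more cache thrashing); setting it high makes it almost always true
(too little balancing), which matches the idle numbers above.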

Ingo Molnar wrote on Wednesday, October 06, 2004 12:48 AM
> could you please try the test in 1 msec increments around 10 msec? It
> would be very nice to find a good formula and the 5 msec steps are too
> coarse. I think it would be nice to test 7,9,11,13 msecs first, and then
> the remaining 1 msec slots around the new maximum. (assuming the
> workload measurement is stable.)

I should've posted the whole thing yesterday; we had measurements at 7.5 and
12.5 ms. Here are the results (repeating 5, 10, and 15 ms for easy reading).

  5.0 ms   106.0
  7.5 ms   110.3
 10.0 ms   112.5
 12.5 ms   112.0
 15.0 ms   111.6


> > This value defaulted to 10ms before the domain scheduler; why does the domain
> > scheduler need to change it to 2.5ms? And on what basis was that decision
> > made? We are proposing changing that number back to 10ms.
>
> agreed. What value does cache_decay_ticks have on your box?


I see all the fancy calculation of cache_decay_ticks on x86, but nothing
actually uses it in the domain scheduler. Anyway, my box (ia64) has that value
hard-coded to 10ms.
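
If we ever wanted to reuse it, one option (just a sketch, assuming
cache_decay_ticks stays in jiffies and cache_hot_time stays in nanoseconds;
not a patch I am proposing here) would be to derive the domain's
cache_hot_time from cache_decay_ticks at domain setup instead of hard-coding
2.5 msec:

/*
 * Sketch only, not a tested patch: convert the per-arch cache_decay_ticks
 * (jiffies) into a cache_hot_time value (nanoseconds) for the sched domain.
 * E.g. with HZ=1000 and cache_decay_ticks=10 this gives 10 msec.
 */
static unsigned long long cache_hot_time_from_decay_ticks(void)
{
	return (unsigned long long)cache_decay_ticks * (1000000000ULL / HZ);
}

That would at least make the default follow whatever each architecture
already claims about its caches, instead of a single hard-coded number.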

- Ken

