Chen, Kenneth W wrote on Tuesday, October 05, 2004 10:31 AM:
> We have experimented with a similar thing, by bumping sd->cache_hot_time up
> to a very large number, like 1 sec. What we measured was an equally low
> throughput, but that was because of not enough load balancing.
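For reference, this is roughly how we understand the cache-hot check in the
2.6 domain scheduler to work (a minimal standalone sketch, paraphrased from
memory, not the actual kernel source; the struct and function names below are
stand-ins, but sd->cache_hot_time plays the role shown):

/* Sketch only.  Assumed semantics: p->timestamp is when the task last ran
 * (ns), sd->cache_hot_time is the "still cache hot" window (ns). */
struct task_info {
	unsigned long long timestamp;		/* last time the task ran, ns */
};

struct sched_domain_info {
	unsigned long long cache_hot_time;	/* cache-hot window, ns */
};

/* A task that ran within the last cache_hot_time ns is treated as cache
 * hot, and the load balancer prefers not to migrate it.  With a huge value
 * (e.g. 1 sec) almost every runnable task looks hot, so very little
 * migration happens. */
static inline int task_is_cache_hot(const struct task_info *p,
				    unsigned long long now,
				    const struct sched_domain_info *sd)
{
	return (long long)(now - p->timestamp) <
	       (long long)sd->cache_hot_time;
}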
Since we are talking about load balancing, we decided to measure various
values of the cache_hot_time variable to see how it affects application
performance. We first established a baseline number with the vanilla base
kernel (default of 2.5ms), then swept that variable up to 1000ms. All of the
experiments were done with Ingo's patch posted earlier. Here are the results
(test environment: 4-way SMP machine, 32 GB memory, 500 disks, running an
industry standard db transaction processing workload):
cache_hot_time | workload throughput (relative, 2.5ms baseline = 100)
----------------------------------------------------------------------
      2.5ms    |   100.0  (0% idle)
      5ms      |   106.0  (0% idle)
     10ms      |   112.5  (1% idle)
     15ms      |   111.6  (3% idle)
     25ms      |   111.1  (5% idle)
    250ms      |   105.6  (7% idle)
   1000ms      |   105.4  (7% idle)
Clearly the default value for SMP gives the worst application throughput (12%
below peak performance). When it is set too low, the kernel is too aggressive
about load balancing and we still see cache thrashing despite the perf fix.
However, if it is set too high, the kernel becomes too conservative and does
not do enough load balancing.
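To make the trade-off concrete, the migration decision (again a paraphrased
sketch, not verbatim kernel code; in the trees we are familiar with it is made
in can_migrate_task()) looks roughly like this, reusing the
task_is_cache_hot() helper sketched above:

/* Sketch only: hot tasks are normally skipped, but after enough failed
 * balance attempts the balancer migrates them anyway so an idle CPU is not
 * starved forever.  The last two parameters are stand-ins for fields of the
 * real sched_domain (nr_balance_failed, cache_nice_tries). */
static int should_migrate(const struct task_info *p,
			  unsigned long long now,
			  const struct sched_domain_info *sd,
			  unsigned int nr_balance_failed,
			  unsigned int cache_nice_tries)
{
	if (nr_balance_failed > cache_nice_tries)
		return 1;	/* persistent imbalance: move even hot tasks */
	if (task_is_cache_hot(p, now, sd))
		return 0;	/* keep cache affinity */
	return 1;
}

With a 2.5ms window almost nothing counts as hot, so tasks bounce between
CPUs; with 250ms or more nearly everything counts as hot and the balancer
mostly waits for the failure override, which is consistent with the idle time
climbing in the table above.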
This value defaulted to 10ms before the domain scheduler; why does the domain
scheduler need to change it to 2.5ms, and on what basis was that decision
made? We are proposing to change that number back to 10ms.
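Assuming the 2.5ms default is the one set in the SD_CPU_INIT initializer in
include/linux/topology.h (that is where we believe it lives; the exact
location and literal may differ between trees), the change we have in mind
amounts to a one-liner along these lines (illustrative only, not a formal
patch):

-	.cache_hot_time		= (5*1000000/2),	\
+	.cache_hot_time		= (10*1000000),		\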