`The basic idea is that a definitive statement cannot be made about
the characteristics of all systems, but a probabilistic statement
about the range in which the characteristics of most systems would lie
can be made. The concept of confidence intervals introduced in this
chapter is one of the fundamental concepts that every performance
analyst needs to understand well. In the remainder of this book, most
conclusions drawn from samples are stated in terms of confidence
intervals.' - Raj Jain.
Earlier in the book he mentions some pitfalls of performance analysis,
one of which is not doing enough analysis after collecting the data.
I will now take some time to explain how the results of these kernel
benchmarks should be analysed. Someone with a greater
understanding of performance analysis and statistics: please correct
me if I got it wrong.
Section 13.4.2 explains how to compare unpaired observations. This is
what we are doing, as we have two sets of nA=nB=20 (or 10) observations
xA[0..19] and xB[0..19] for two different kernel releases A and B.
The steps of the so-called t-test are:
1. compute the sample means meanA = sum(xA)/nA and meanB similarly.
2. compute the sample standard deviations:
sA = sqrt((sum(xA**2)-nA*(meanA**2))/(nA-1)) and sB similarly.
3. compute the mean difference meanA-meanB
4. compute the standard deviation of the mean difference:
s = sqrt(sA**2/nA + sB**2/nB)
5. compute the effective number of degrees of freedom: (this gets tricky)
v = (sA**2/nA + sB**2/nB)**2/(sA**4/nA**2/(nA+1) + sB**4/nB**2/(nB+1)) - 2
6. compute the confidence interval for the mean difference.
That is, look up the value t=t[1-alpha/2,v] in a t-table,
where 1-alpha/2 is 0.95 for a 90% confidence level, 0.975 for 95%, etc.
Then the confidence interval is: (meanA-meanB) +- t*s.
7. If the confidence interval includes zero, the difference is not
significant at the 100*(1-alpha) % confidence level. If zero is
not included, the system with the better mean value is better.
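As a rough sketch, steps 1 to 7 could look like this in Python. The
function names and the numbers in the example are made up for
illustration, they are not from the book or from any real benchmark run.
The variance is computed from squared deviations, which is algebraically
the same as the formula in step 2 but less sensitive to rounding. The t
value still has to be looked up by hand once v is known.

```python
import math

def unpaired_stats(xA, xB):
    """Steps 1-5: return (mean difference, its std deviation s, v)."""
    nA, nB = len(xA), len(xB)
    # Step 1: sample means.
    meanA = sum(xA) / nA
    meanB = sum(xB) / nB
    # Step 2: sample variances sA**2 and sB**2, with n-1 in the denominator.
    varA = sum((x - meanA) ** 2 for x in xA) / (nA - 1)
    varB = sum((x - meanB) ** 2 for x in xB) / (nB - 1)
    # Step 3: mean difference.
    diff = meanA - meanB
    # Step 4: standard deviation of the mean difference.
    s = math.sqrt(varA / nA + varB / nB)
    # Step 5: effective number of degrees of freedom.
    v = ((varA / nA + varB / nB) ** 2
         / ((varA / nA) ** 2 / (nA + 1) + (varB / nB) ** 2 / (nB + 1))) - 2
    return diff, s, v

def confidence_interval(diff, s, t):
    """Step 6: interval diff +- t*s, with t = t[1-alpha/2, v] from a table."""
    return diff - t * s, diff + t * s

def is_significant(interval):
    """Step 7: the difference is significant if zero is outside the interval."""
    low, high = interval
    return not (low <= 0.0 <= high)

# Made-up timings for two hypothetical kernel releases A and B.
xA = [1.0, 2.0, 3.0, 4.0, 5.0]
xB = [2.0, 4.0, 6.0, 8.0, 10.0]
diff, s, v = unpaired_stats(xA, xB)
# v comes out a little under 7; t[0.95, 6] = 1.943 is used here for 90%.
interval = confidence_interval(diff, s, 1.943)
print(diff, s, v, interval, is_significant(interval))
```

Note that v is usually not a whole number, so it has to be rounded
before looking up t in a table.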
The alpha parameter above says how much uncertainty we can tolerate
when deciding whether the difference is significant. 0.1 and 0.05 are
commonly used, giving 90% and 95% confidence levels, but that does not
say which one we should use. I do not know.
Now you only need some t[]-tables. You should find one in any
statistics book.
Be careful out there: statistics is dangerous.
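If no table is at hand, the t value can also be approximated
numerically. The following is only a sketch under my own assumptions: it
integrates the Student t density with Simpson's rule and bisects for the
quantile, and the names t_pdf, t_cdf and t_quantile are made up here.
It is not how the book does it, and a proper statistics library is
likely to be more accurate.

```python
import math

def t_pdf(x, v):
    """Density of Student's t distribution with v degrees of freedom."""
    log_c = (math.lgamma((v + 1) / 2) - math.lgamma(v / 2)
             - 0.5 * math.log(v * math.pi))
    return math.exp(log_c - (v + 1) / 2 * math.log1p(x * x / v))

def t_cdf(x, v, steps=2000):
    """P(T <= x) for x >= 0, by Simpson's rule on [0, x] (steps is even)."""
    h = x / steps
    total = t_pdf(0.0, v) + t_pdf(x, v)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, v)
    return 0.5 + total * h / 3

def t_quantile(p, v):
    """Approximate t[p, v] for 0.5 < p < 1 by bisection on t_cdf."""
    if not 0.5 < p < 1:
        raise ValueError("p must be between 0.5 and 1")
    low, high = 0.0, 1.0
    while t_cdf(high, v) < p:   # widen the bracket until it contains t
        high *= 2
    for _ in range(50):         # halve the bracket 50 times
        mid = (low + high) / 2
        if t_cdf(mid, v) < p:
            low = mid
        else:
            high = mid
    return (low + high) / 2

# 90% confidence level (alpha = 0.1) with v = 10 degrees of freedom.
print(t_quantile(0.95, 10))
```

For v = 10 this should agree with the 1.812 that printed tables give
for t[0.95, 10]. Unlike a table, it also accepts a fractional v.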
-- #Jaakko Hyvätti Jaakko.Hyvatti@WWW.FI http://www.fi/~jaakko/ +358 40 5011222 echo 'movl $36,%eax;int $128;movl $0,%ebx;movl $1,%eax;int $128'|as -o/bin/sync