Re: [RFC][PATCH 00/18] Increase resolution of load weights

From: Ingo Molnar
Date: Thu Apr 21 2011 - 02:16:55 EST



* Nikhil Rao <ncrao@xxxxxxxxxx> wrote:

> Major TODOs:
> - Detect overflow in update shares calculations (time * load), and set load_avg
> to maximum possible value (~0ULL).
> - tg->task_weight uses an atomic which needs to be updated to 64-bit on 32-bit
> machines. Might need to add a lock to protect this instead of atomic ops.
> - Check wake-affine math and effective load calculations for overflows.
> - Needs more testing and need to ensure fairness/balancing is not broken.
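
One possible shape for the overflow check in the first TODO item, as a minimal
sketch only: the helper name is hypothetical and not taken from the patch
series, and it assumes a compiler that provides __builtin_mul_overflow (GCC 5
or later, or Clang). It multiplies a time delta by a load and clamps the
result to ~0ULL when the product does not fit in 64 bits.

```c
#include <stdint.h>

/*
 * Hypothetical helper, not part of the patch series: compute
 * delta * load and saturate to ~0ULL if the 64-bit product overflows.
 * Assumes the GCC/Clang builtin __builtin_mul_overflow is available.
 */
static inline uint64_t mul_u64_saturate(uint64_t delta, uint64_t load)
{
	uint64_t result;

	if (__builtin_mul_overflow(delta, load, &result))
		return ~0ULL;

	return result;
}
```

Whether a builtin like this or an explicit range check is cheaper in the real
update path would have to be measured.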

Please measure micro-costs accurately as well, via perf stat --repeat 10 or so.

For example, on a test system, 200k pipe-triggered context switches (100k
pipe ping-pongs) cost this much:

$ taskset 1 perf stat --repeat 10 ./pipe-test-100k

     630.908390  task-clock-msecs  #     0.434 CPUs    ( +-  0.499% )
        200,001  context-switches  #     0.317 M/sec   ( +-  0.000% )
              0  CPU-migrations    #     0.000 M/sec   ( +- 66.667% )
            145  page-faults       #     0.000 M/sec   ( +-  0.253% )
  1,374,978,900  cycles            #  2179.364 M/sec   ( +-  0.516% )
  1,373,646,429  instructions      #     0.999 IPC     ( +-  0.134% )
    264,223,224  branches          #   418.798 M/sec   ( +-  0.134% )
     16,613,988  branch-misses     #     6.288 %       ( +-  0.755% )
        204,162  cache-references  #     0.324 M/sec   ( +- 18.805% )
          5,152  cache-misses      #     0.008 M/sec   ( +- 21.280% )

We want to know the delta in the 'instructions' value resulting from the patch
(this can be measured very accurately) and we also want to see the 'cycles'
effect - both can be measured pretty accurately.
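
As a minimal sketch of that comparison, the relative change between two
counter readings could be computed as below. The helper name is hypothetical;
the inputs would be the 'instructions' (or 'cycles') totals from a baseline
run and a patched run.

```c
#include <stdint.h>

/*
 * Hypothetical helper: relative change, in percent, of a counter
 * (e.g. 'instructions') between a baseline run and a patched run.
 * The baseline must be non-zero.
 */
static double counter_delta_pct(uint64_t baseline, uint64_t patched)
{
	/* Subtract in floating point so that a decrease comes out negative. */
	return ((double)patched - (double)baseline) / (double)baseline * 100.0;
}
```

A result whose magnitude is smaller than the +- noise reported for that
counter is not yet conclusive.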

I've attached the testcase - you might need to increase the --repeat value so
that noise drops below the level of the effect from these patches. (the effect
is likely in the 0.01% range)
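
To get a rough idea of how large --repeat has to be, here is a small sketch
under two assumptions that are mine, not perf's documentation: the runs are
independent, and the reported +- value shrinks as 1/sqrt(runs). The helper
name is hypothetical.

```c
/*
 * Hypothetical helper: estimate the --repeat value needed to shrink the
 * relative noise from noise_now_pct (observed with runs_now runs) down to
 * noise_target_pct, assuming the noise scales as 1/sqrt(runs).
 */
static unsigned long repeat_needed(unsigned long runs_now,
				   double noise_now_pct,
				   double noise_target_pct)
{
	double ratio = noise_now_pct / noise_target_pct;
	double runs = (double)runs_now * ratio * ratio;
	unsigned long whole = (unsigned long)runs;

	/* Round up without pulling in libm. */
	return (double)whole < runs ? whole + 1 : whole;
}
```

With the 'instructions' line above (0.134% noise over 10 runs) and a 0.01%
target, this works out to roughly 1800 runs, and more than that if the noise
should sit clearly below the effect.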

It would also be nice to see how 'size vmlinux' changes with these patches
applied, on a 'make defconfig' build.

Thanks,

Ingo

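#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>

#define LOOPS 100000

/*
 * Pipe ping-pong: parent and child bounce an int across two pipes.
 * Each round trip forces the two tasks to hand the CPU to each other.
 */
int main(void)
{
	int pipe_1[2], pipe_2[2];
	int m = 0, i;
	pid_t pid;

	if (pipe(pipe_1) || pipe(pipe_2)) {
		perror("pipe");
		return 1;
	}

	pid = fork();
	if (pid < 0) {
		perror("fork");
		return 1;
	}

	if (!pid) {
		/* Child: wait for the ping, then send the pong back. */
		for (i = 0; i < LOOPS; i++) {
			read(pipe_1[0], &m, sizeof(int));
			write(pipe_2[1], &m, sizeof(int));
		}
	} else {
		/* Parent: send the ping, then wait for the pong. */
		for (i = 0; i < LOOPS; i++) {
			write(pipe_1[1], &m, sizeof(int));
			read(pipe_2[0], &m, sizeof(int));
		}
	}

	return 0;
}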