Why is get_user_pages so slow?
From: Micha Nelissen
Date: Tue Aug 10 2010 - 15:59:19 EST
Hi all,
Why is get_user_pages much slower than taking the faults? (I would
expect it to be faster).
The attached example program first mallocs a block of memory (64 MB in
this case), then reads and writes it to take the page faults. After freeing
it, the program maps a new region of the same size using mmap with
MAP_POPULATE, which should avoid the faults and be faster because
everything is mapped in one go. I think mmap uses get_user_pages in
this case.
$ ./memspeed
malloc took 0 msecs
read took 14 msecs
write took 0 msecs
free took 1 msecs
mmap took 45 msecs
munmap took 5 msecs
Using MAP_POPULATE is about 3 times slower (45 ms for mmap versus 14 ms
for the read loop) than the 'stupid' implementation!
I'm running a Core 2 Duo E6300 system with Linux 2.6.28.4.
Am I doing something wrong? MAP_POPULATE seems a bit of a joke to me.
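One way to check whether MAP_POPULATE really removes the faults, independent of wall-clock timing, is to compare minor-fault counts on first touch for a plain mapping and a populated one. The following is only a minimal sketch, assuming Linux and that getrusage(RUSAGE_SELF) reports ru_minflt; the helper names minor_faults and faults_on_first_touch are hypothetical and not part of the attached program.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* for MAP_ANONYMOUS and MAP_POPULATE */
#endif
#include <stddef.h>
#include <sys/mman.h>
#include <sys/resource.h>

/* Minor page faults taken by this process so far, or -1 on error. */
static long minor_faults(void)
{
    struct rusage ru;

    if (getrusage(RUSAGE_SELF, &ru) != 0)
        return -1;
    return ru.ru_minflt;
}

/*
 * Map len bytes of anonymous memory with extra_flags added to the mmap
 * flags, write one byte per page, and return the number of minor faults
 * taken by those writes alone. Returns -1 on failure.
 */
static long faults_on_first_touch(size_t len, size_t page, int extra_flags)
{
    char *buf;
    long before, after;
    size_t off;

    buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
               MAP_PRIVATE | MAP_ANONYMOUS | extra_flags, -1, 0);
    if (buf == MAP_FAILED)
        return -1;

    before = minor_faults();
    for (off = 0; off < len; off += page)
        ((volatile char *)buf)[off] = 1; /* byte offsets: one write per page */
    after = minor_faults();

    munmap(buf, len);
    if (before < 0 || after < 0)
        return -1;
    return after - before;
}
```

Calling faults_on_first_touch(len, page, 0) and faults_on_first_touch(len, page, MAP_POPULATE) with the same length gives two counts to compare. I would expect the populated mapping to report far fewer faults, but the exact numbers probably depend on the kernel version and on settings such as transparent huge pages.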
Thanks,
Micha
#define _GNU_SOURCE /* for MAP_ANONYMOUS and MAP_POPULATE */
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/time.h>

#define SIZE (64 << 20) /* 64 MB */

/* Elapsed time between two timestamps, in whole milliseconds. */
static unsigned tv_msecs(struct timeval *tvs, struct timeval *tve)
{
    return (tve->tv_sec - tvs->tv_sec) * 1000 +
           (tve->tv_usec - tvs->tv_usec) / 1000;
}

int main(void)
{
    struct timeval tvs, tve;
    void *buf;
    int *p, *e;

    gettimeofday(&tvs, NULL);
    buf = malloc(SIZE);
    gettimeofday(&tve, NULL);
    if (buf == NULL) {
        fprintf(stderr, "malloc failed\n");
        return 1;
    }
    printf("malloc took %u msecs\n", tv_msecs(&tvs, &tve));

    /*
     * Read pass. p is an int *, so p += 0x1000 advances
     * 0x1000 * sizeof(int) bytes per step, not 0x1000 bytes.
     */
    gettimeofday(&tvs, NULL);
    for (p = buf, e = p + SIZE / sizeof(*p); p < e; p += 0x1000)
        (void)*(volatile int *)p;
    gettimeofday(&tve, NULL);
    printf("read took %u msecs\n", tv_msecs(&tvs, &tve));

    /* Write pass over the same addresses. */
    gettimeofday(&tvs, NULL);
    for (p = buf, e = p + SIZE / sizeof(*p); p < e; p += 0x1000)
        *(volatile int *)p = 0xaa55aa55;
    gettimeofday(&tve, NULL);
    printf("write took %u msecs\n", tv_msecs(&tvs, &tve));

    gettimeofday(&tvs, NULL);
    free(buf);
    gettimeofday(&tve, NULL);
    printf("free took %u msecs\n", tv_msecs(&tvs, &tve));

    /* Map a fresh region and ask the kernel to populate it up front. */
    gettimeofday(&tvs, NULL);
    buf = mmap(NULL, SIZE, PROT_READ | PROT_WRITE,
               MAP_PRIVATE | MAP_ANONYMOUS | MAP_POPULATE, -1, 0);
    gettimeofday(&tve, NULL);
    if (buf == MAP_FAILED) {
        fprintf(stderr, "mmap failed\n");
        return 1;
    }
    printf("mmap took %u msecs\n", tv_msecs(&tvs, &tve));

    gettimeofday(&tvs, NULL);
    munmap(buf, SIZE);
    gettimeofday(&tve, NULL);
    printf("munmap took %u msecs\n", tv_msecs(&tvs, &tve));
    return 0;
}
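
One detail of the read and write loops above may be worth checking: p is an int *, so p += 0x1000 steps by 0x1000 ints rather than 0x1000 bytes. The following is a minimal sketch of the difference, assuming 4 KB pages and a buffer length that is a multiple of PAGE * sizeof(int); PAGE and both helper names are hypothetical and not part of the attached program.

```c
#include <stddef.h>
#include <stdlib.h>

#define PAGE 0x1000 /* assumed page size in bytes (4 KB) */

/* Read one byte at the start of every PAGE-byte page; return pages touched. */
static size_t touch_every_page(void *buf, size_t len)
{
    volatile char *p = buf; /* char * cursor: += counts bytes */
    volatile char *e = p + len;
    size_t touched = 0;

    for (; p < e; p += PAGE) {
        (void)*p;
        touched++;
    }
    return touched;
}

/* Same loop with an int * cursor: each step is PAGE * sizeof(int) bytes. */
static size_t touch_int_stride(void *buf, size_t len)
{
    volatile int *p = buf; /* int * cursor: += counts ints */
    volatile int *e = p + len / sizeof(int);
    size_t touched = 0;

    for (; p < e; p += PAGE) {
        (void)*p;
        touched++;
    }
    return touched;
}
```

With a 4-byte int, the int * variant visits one page in four: for a 64-page buffer, touch_every_page returns 64 and touch_int_stride returns 16. If that applies here, the read and write passes fault in only a quarter of the pages that MAP_POPULATE has to map, which would skew the comparison.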