Re: Why is get_user_pages so slow?

From: Kevin Easton
Date: Wed Aug 11 2010 - 23:50:40 EST


Quoting Micha Nelissen <micha@xxxxxxxxxxxxxx>:

Hi all,

Why is get_user_pages much slower than taking the faults? (I would expect it to be faster).

Attached example program first mallocs a piece of memory (64MB in this case) then reads it "to take the faults". Afterwards, it uses mmap with MAP_POPULATE to "speed up" and not to have to take the faults, but have everything mapped in one go. I think mmap is using get_user_pages in this case.

$ ./memspeed
malloc took 0 msecs
read took 14 msecs
write took 0 msecs
free took 1 msecs
mmap took 45 msecs
munmap took 5 msecs

Using MAP_POPULATE is 3 times as slow as the 'stupid' implementation! I'm running a Core 2 Duo E6300 system with Linux 2.6.28.4.

Am I doing something wrong? MAP_POPULATE seems a bit of a joke to me.

Hi Micha,

Yep, you are. Because your pointer 'p' is a pointer to int, incrementing it by 0x1000 in your loops actually advances it by 0x1000 * sizeof(int) bytes - so you're only touching one page in four.

If you change the types of 'buf', 'p' and 'e' to 'char *' then it touches every page - and (at least on my test box) the MAP_POPULATE case pulls ahead.

- Kevin