Re: mmap/mlock performance versus read

From: Linus Torvalds (torvalds@transmeta.com)
Date: Wed Apr 05 2000 - 21:34:50 EST


On Wed, 5 Apr 2000, Albert D. Cahalan wrote:
> Linus Torvalds writes:
>
> > - page faulting is expensive. That's how the mapping gets populated,
> > and it's quite slow.
>
> Could mmap get a flag that asks for async read and map?
> So mmap returns, then pages start to appear as the IO progresses.

It's not the IO on the pages themselves, it's actually the act of
populating the page tables that is quite costly. And doing that in the
background is basically impossible.

You can do it synchronously, and that is basically what mlock() will do
with "make_pages_present()". However, that path is not all that optimized
(not worth it), and even if it was hugely optimized it would _still_ be
quite slow. The page tables are just fairly complex data structures.
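
As a rough user-space sketch of that synchronous path (not the kernel's
own code): map the file, then call mlock() so the pages get faulted in up
front instead of one fault at a time. The helper name sum_mapped() and its
prefault flag are made up for illustration, and mlock() is allowed to fail
here, in which case the loop simply takes the page faults as it goes.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: map a file, optionally mlock() it so the page
 * tables are populated up front, then sum its bytes.
 * Returns the byte sum, or -1 on error (including an empty file). */
long sum_mapped(const char *path, int prefault)
{
    assert(path != NULL);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) < 0 || st.st_size == 0) {
        close(fd);
        return -1;
    }
    size_t len = (size_t)st.st_size;

    unsigned char *p = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);              /* the mapping stays valid after close() */
    if (p == MAP_FAILED)
        return -1;

    /* mlock() faults every page in synchronously. It can fail (for
     * example on RLIMIT_MEMLOCK); then we fall back to taking the
     * page faults one by one in the loop below. */
    int locked = prefault && mlock(p, len) == 0;

    long sum = 0;
    for (size_t i = 0; i < len; i++)
        sum += p[i];

    if (locked)
        munlock(p, len);
    munmap(p, len);
    return sum;
}
```

Calling sum_mapped(path, 1) and sum_mapped(path, 0) gives the same result.
The only difference is whether the page table population happens in one
go inside mlock() or is spread over the loop; the cost is paid either way.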

And on top of that you still have the actual CPU TLB miss costs etc.,
which can often be avoided if you just re-read into the same area instead
of being excessively clever with memory management just to avoid a copy.
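
A minimal sketch of the "re-read into the same area" approach, assuming a
plain sequential pass over the file: every chunk lands in one fixed
buffer, so the same few pages get touched over and over. The helper name
sum_by_read() and the 64 KiB buffer size are assumptions for illustration,
not measured choices.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: sum a file's bytes by read()ing it chunk by
 * chunk into one fixed buffer. Returns the byte sum, or -1 on error. */
long sum_by_read(const char *path)
{
    static unsigned char buf[64 * 1024];  /* reused for every chunk */

    assert(path != NULL);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    long sum = 0;
    for (;;) {
        ssize_t n = read(fd, buf, sizeof buf);
        if (n < 0) {
            if (errno == EINTR)
                continue;   /* interrupted by a signal, retry */
            close(fd);
            return -1;
        }
        if (n == 0)
            break;          /* end of file */
        for (ssize_t i = 0; i < n; i++)
            sum += buf[i];
    }
    close(fd);
    return sum;
}
```

The copy is still there, but no new mappings are set up per chunk; the
buffer's own pages are faulted in once and then just reused.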

There are _always_ going to be many cases where memcpy() (i.e. "read()" in
this case) is faster, just because it avoids all the extra complexity,
while mmap() is going to be faster in other cases.

                Linus



