Re: hugetlb demand paging patch part [2/3]

From: 'David Gibson'
Date: Thu Apr 15 2004 - 23:52:41 EST


On Thu, Apr 15, 2004 at 09:13:38PM -0700, Chen, Kenneth W wrote:
> David Gibson wrote on Thursday, April 15, 2004 8:27 PM
> > Ah! So it's just an optimiziation - it makes a bit more sense to me
> > now. I had assumed that this case (hugepage get_user_pages()) would
> > be sufficiently rare that it would not require optimization.
> > Apparently not.
>
> It's a huge deal because for *every* I/O, kernel has to do get_user_pages()
> to lock the page, it's really gets in the way with the spin_lock as well.
>
> spin_lock(&mm->page_table_lock);
> do {
> struct page *map;
> int lookup_write = write;
> while (!(map = follow_page(mm, start, lookup_write))) {
>
> With current state of art platform, I/O requirement pushes into 200K
> per second, this become quite significant.

Ok. This makes sense now that you explain it.

> > Do you know where the cycles are going without this optimization? In
> > particular, could it be just the find_vma() in hugepage_vma() called
> > before follow_huge_addr()? I note that IA64 is the only arch to have
> > a non-trivial hugepage_vma()/follow_huge_addr() and that its
> > follow_huge_addr() doesn't actually use the vma passed in.
>
> That's one, plus the spin lock mentioned above.

And akpm has just explained why it can be avoided in the hugepage
case.

> > If we could get rid of follow_hugetlb_pages() it would remove an ugly
> > function from every arch, which would be nice.
>
> I hope the goal here is not to trim code for existing prefaulting scheme.
> That function has to go for demand paging, and demand paging comes with
> a performance price most people don't realize. If the goal here is to
> make the code prettier, I vote against that.

Well, I'm attempting to understand the hugepage code across all the
archs, so that I can try to implement copy-on-write with a minimum of
arch specific gunk. Simplifying and consolidating the existing code
across archs would be a helpful first step, if possible.

--
David Gibson | For every complex problem there is a
david AT gibson.dropbear.id.au | solution which is simple, neat and
| wrong.
http://www.ozlabs.org/people/dgibson
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/