Re: [RFC] Demand faulting for large pages

From: Andi Kleen
Date: Fri Aug 05 2005 - 11:51:35 EST


On Fri, Aug 05, 2005 at 11:37:27AM -0500, Adam Litke wrote:
> On Fri, 2005-08-05 at 10:53, Andi Kleen wrote:
> > On Fri, Aug 05, 2005 at 10:21:38AM -0500, Adam Litke wrote:
> > > Below is a patch to implement demand faulting for huge pages. The main
> > > motivation for changing from prefaulting to demand faulting is so that
> > > huge page allocations can follow the NUMA API. Currently, huge pages
> > > are allocated round-robin from all NUMA nodes.
> >
> > I think matching DEFAULT is better than having a different default for
> > huge pages than for small pages.
>
> I am not exactly sure what the above means. Is 'DEFAULT' a system
> default numa allocation policy?

It's one of the four NUMA policies: DEFAULT, PREFERRED, INTERLEAVE, BIND.

It just means allocate on the local node if possible, otherwise fall back.

You said you wanted INTERLEAVE by default, which I think is a bad idea.
It should be opt-in, as it is for all other allocations.
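
For reference, these are the four modes as defined in
include/linux/mempolicy.h in 2.6 kernels (the comments here are a
summary added for this mail, not from the header itself):

    /* Policies, passed to set_mempolicy()/mbind() */
    #define MPOL_DEFAULT    0   /* allocate on the local node, fall back if full */
    #define MPOL_PREFERRED  1   /* prefer one node, fall back elsewhere */
    #define MPOL_BIND       2   /* allocate only from the given nodeset */
    #define MPOL_INTERLEAVE 3   /* round-robin across the given nodeset */

An application that does want interleaving can ask for it explicitly,
e.g. with numactl --interleave=all.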


> > > patch just moves the logic from hugetlb_prefault() to
> > > hugetlb_pte_fault().
> >
> > Are you sure you fixed get_user_pages to handle this properly? It doesn't
> > look like it.
>
> Unless I am missing something, the call to follow_hugetlb_page() in
> get_user_pages() is just an optimization. Removing it means
> follow_page() will be called individually for each PAGE_SIZE page in the
> huge page. We can probably do better but I didn't want to cloud this
> patch with that logic.

The problem is that get_user_pages needs to handle the case of a large
page that has not been faulted in yet. The SLES9 implementation made
some changes for this.
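
Roughly, the lookup loop in follow_hugetlb_page() (mm/hugetlb.c) would
need something along these lines - a sketch only, not the SLES9 code,
and assuming the hugetlb_pte_fault() entry point and signature from
your patch:

    pte = huge_pte_offset(mm, vaddr);
    if (!pte || pte_none(*pte)) {
            /* Huge page not instantiated yet: fault it in on demand
             * instead of WARNing on the empty PTE as the current
             * prefault-era code does. */
            int ret = hugetlb_pte_fault(mm, vma, vaddr,
                                        vma->vm_flags & VM_WRITE);
            if (ret == VM_FAULT_SIGBUS || ret == VM_FAULT_OOM)
                    return i ? i : -EFAULT;
            pte = huge_pte_offset(mm, vaddr);
    }
    page = &pte_page(*pte)[vpfn % (HPAGE_SIZE / PAGE_SIZE)];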

You don't change it at all, so I suspect it doesn't work yet.

It's a common case - think of people doing raw IO on huge-page shared memory.
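
Something like this is enough to hit that path (hypothetical device
path and segment size, error checking omitted) - the O_DIRECT read pins
the user buffer with get_user_pages(), so with demand faulting the huge
pages may not be instantiated yet when the block layer gets there:

    #define _GNU_SOURCE             /* for O_DIRECT */
    #include <sys/ipc.h>
    #include <sys/shm.h>
    #include <fcntl.h>
    #include <unistd.h>

    #ifndef SHM_HUGETLB
    #define SHM_HUGETLB 04000       /* kernel flag, may be missing from libc headers */
    #endif

    #define LEN (16UL * 1024 * 1024)        /* multiple of the huge page size */

    int main(void)
    {
            /* huge-page backed shared memory segment */
            int id = shmget(IPC_PRIVATE, LEN, IPC_CREAT | SHM_HUGETLB | 0600);
            char *buf = shmat(id, NULL, 0);
            int fd = open("/dev/sdX", O_RDONLY | O_DIRECT);  /* hypothetical */

            /* Raw IO straight into the huge pages: the block layer
             * calls get_user_pages() on buf before any fault has
             * instantiated the pages. */
            read(fd, buf, LEN);
            return 0;
    }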

-Andi