On Fri, 23 Mar 2007 01:44:38 -0700 "Ken Chen" <kenchen@xxxxxxxxxx> wrote:
> On 3/21/07, Adam Litke <agl@xxxxxxxxxx> wrote:
> > The main reason I am advocating a set of pagetable_operations is to
> > enable the development of a new hugetlb interface. During the hugetlb
> > BOFS at OLS last year, we talked about a character device that would
> > behave like /dev/zero. Many of the people were talking about how they
> > just wanted to create MAP_PRIVATE hugetlb mappings without all the fuss
> > about the hugetlbfs filesystem. /dev/zero is a familiar interface for
> > getting anonymous memory so bringing that model to huge pages would make
> > programming for anonymous huge pages easier.
>
> I think we have enough infrastructure currently in hugetlbfs to
> implement what Adam wants for something like a /dev/hugetlb char
> device (except we can't afford to have a zero hugetlb page since it
> will be too costly on some arch).
>
> I really like the idea of having something similar to /dev/zero for
> hugetlb page. So I coded it up on top of existing hugetlbfs. The
> core change is really small and half of the patch is really just
> moving things around. I think this at least can partially fulfill the
> goal.
Standing back and looking at this...
afaict the whole reason for this work is to provide a quick-n-easy way to
get private mappings of hugetlb pages. With the emphasis on quick-n-easy.
We can do the same with hugetlbfs, but that involves (horror) "fuss".
The way to avoid "fuss" is of course to do it once, do it properly then stick
it in a library which everyone uses.
But libraries are hard, for a number of distributional reasons. It is
easier for us to distribute this functionality within the kernel. In fact,
if Linus's tree included a ./userspace/libkernel/libhugetlb/ then we'd
probably provide this functionality in there.
This comes up regularly, and it's pretty sad.
Probably the kernel team should be maintaining, via existing processes, a
separate libkernel project, to fix these distributional problems. The
advantage in this case is of course that our new hugetlb functionality
would be available to people on 2.6.18 kernels, not only on 2.6.22 and
later.
Am I wrong?