Re: [patch] do_no_pfn handler

From: Carsten Otte
Date: Tue Apr 11 2006 - 16:03:58 EST


Linus Torvalds wrote:
> You _really_ cannot do COW together with "random pfn filling".
I still have'nt found a good way to do so, even after discussing with Nick and Hugh, but that's exactly where I intend to get for the xip stuff.

Today, the _only_ code that uses the struct page behind those DCSS segments is aops->nopage (as return value) and do_wp_page. Those small servers have almost no local memory (kernel, libraries, and binaries are shared), and the mem_map array is a large overhead.

> You can do COW with a pure remap_pfn_range() (ie a /dev/mem kind of
> mapping, or a frame buffer etc), but that's only because it has a very
> magic special case that is used to distinguish between cow'ed pages and
> the pages that were inserted initially.
>
> We have no free bits in the page tables to say "this is a COW page" in
> general (on x86 we could do it, but some other architectures don't have
> any SW-usable bits).
That's true. One can store that information in the vma flags, and split the vma into 3 vmas once we have a write fault. Although that would work in theory, I doubt it would save lot of memory because of too many vmas, and I think we would burn precious CPU horsepower walking all those vmas.
I believe Hugh has already done an implementation for that which he does not consider nice. I have not found a feasible way to adress that issue so far, and I promise to keep from coding until I find a reasonable non-intrusive way to get there.
--

Carsten Otte
IBM Linux technology center
ARCH=s390
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/