Re: Page Colouring (was: 2.6.0 Huge pages not working as expected)

From: Ingo Molnar
Date: Mon Dec 29 2003 - 07:10:57 EST



* William Lee Irwin III <wli@xxxxxxxxxxxxxx> wrote:

> I also heard something about this from daniel. The description I was
> given implied rather different functionality, and raised rather
> serious questions about the implementation he didn't have adequate
> answers for. I also never saw code, despite months of occasional
> discussions about it.
>
> I did get a positive reaction from you at KS, and I've also been
> slaving away at keeping this thing current and improving it when I can
> for a year. Would you mind telling me what the Hell is going on here?
>
> I guess I already know I'm screwed beyond all hope of recovery, but I
> might as well get official confirmation.

i've been following your code (pgcl) and it looks pretty good. (it needs
finishing touches as always, but that's fine.) I tried to backport it to
2.4 before doing 4G/4G but the maintainance overhead skyrocketed and so
it not practical for 2.4-based distribution purposes - but it would be
the perfect kind of thing to start 2.7.0 with. I've not seen any other
code but yours in this area.

i believe the right approach to the 'tons of RAM' problem is to simplify
it as much as possible, ie. go for larger pages (and wrap the MMU format
in the most trivial way) and to deal with 4K pages as a filesystem (and
ELF format) compatibility thing only. Your patch does precisely this.
How much we'll have to 'mix' the two page sizes, only practice will
tell, but the less mixing, the easier it will get. Filesystems on such
systems will match the pagesize anyway.

i'd even suggest to not overdo the fractured-page logic too much - ie.
just 'waste' a full page on a misaligned or single 4K-sized vma -
concentrate on the common case: linearly mapped files and anonymous
mappings. Prefault both of them at PAGE_SIZE granularity and 'waste' the
final partial page. The VM swapout logic should only deal with full
pages. Same for the pagecache: just fill in full pages and dont worry
about granularity.

Your patch already does more than this. But i think if someone does 4K
vmas on a pgcl system or runs it on a 128 MB box and expects perfect
swapping, then it's his damn fault.

Ingo
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/