Re: RFC: CONFIG_PAGE_SHIFT (aka software PAGE_SIZE)

From: David Chinner
Date: Fri Jul 13 2007 - 03:13:32 EST


On Thu, Jul 12, 2007 at 09:34:57AM -0700, Dave Hansen wrote:
> On Thu, 2007-07-12 at 18:31 +0200, Andrea Arcangeli wrote:
> > On Fri, Jul 13, 2007 at 12:44:49AM +1000, David Chinner wrote:
> > > That's crap. Just because a machine has lots of memory does not
> > > make it OK to waste lots of memory.
> >
> > It's not just wasted, it lowers overhead all over the place. Yes, the
> > benefit of wasting less pagecache may largely outweigh the benefit of
> > having a larger page size, but if you've a lot of memory perhaps your
> > working set already fits in the cache, or perhaps you don't fit in the
> > cache regardless of the page size.
>
> Have you guys seen Shaggy's page cache tails?
>
> http://kernel.org/pub/linux/kernel/people/shaggy/OLS-2006/kleikamp.pdf
>
> We've had the same memory waste issue on ppc64 with 64k hardware
> pages.

Sure. Fundamentally, though, I think it is the wrong approach to
take - it's a workaround for a big negative side effect of
increasing page size. It introduces lots of complexity and
difficult-to-test corner cases; judging by the tail packing problems
reiser3 has had over the years, it has the potential to be a
never-ending source of data corruption bugs.
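
To put rough numbers on that side effect, here is a minimal sketch,
assuming every cached file occupies whole pages and nothing packs the
tails. The function names are hypothetical and are not kernel
interfaces; this only illustrates the arithmetic:

```python
def cached_bytes(file_size, page_size):
    """Page cache bytes needed to hold a whole file, rounded up to whole pages.

    Assumption: no tail packing, so a partially filled last page still
    consumes a full page.
    """
    pages = -(-file_size // page_size)  # ceiling division
    return pages * page_size


def tail_waste(file_size, page_size):
    """Bytes in the last page that hold no file data."""
    return cached_bytes(file_size, page_size) - file_size


if __name__ == "__main__":
    # Compare a 4k base page against a 64k base page for a few file sizes.
    for size in (100, 5000, 70000):
        print(size, tail_waste(size, 4096), tail_waste(size, 65536))
```

Under these assumptions a 5000 byte file wastes 3192 bytes of page
cache with 4k pages and 60536 bytes with 64k pages, and that cost is
paid once per cached file.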

I think that fine granularity and aggregation for efficiency of
scale is a better model to use than increasing the base page size.
With PPC, you can handle different page sizes in the hardware (like
MIPS) and the use of 64k base page size is an obvious workaround to
the problem of not being able to use multiple page sizes within the
OS.

Adding a workaround (tail packing) to address the negative side
effects of another workaround (64k base page size) ignores the basic
problem that has led to both these things being done: Linux does not
support multiple page sizes natively...

Cheers,

Dave.
--
Dave Chinner
Principal Engineer
SGI Australian Software Group