Re: RFC: CONFIG_PAGE_SHIFT (aka software PAGE_SIZE)

From: Andrea Arcangeli
Date: Thu Jul 12 2007 - 07:14:59 EST


On Thu, Jul 12, 2007 at 10:12:56AM +1000, David Chinner wrote:
> I need really large filesystems that contain both small and large files to
> work more efficiently on small boxes where we can't throw endless amounts of
> RAM and CPUs at the problem. Hence things like 64k page size are just not an
> option because of the wastage that it entails.

I didn't know you were allocating 4k pages for the small files and 64k
for the large ones in the same fs. That sounds quite a bit
overkill. So it seems all you really need is to reduce the length of
the sg list? Otherwise you could do the above fine without order > 0
+ pte changes and memcpy in the defrag code. Given the amount of cpu
you throw at the problem of deciding 4k or 64k pages and the defrag,
and all complexity involved to handle mixed page-cache-sized per
inode, I doubt the cpu saving of the order page size matters much to
you. Probably the main thing you can measure is your storage subsystem
being too slow if the DMA isn't physically contiguous, hence the need
for those larger pages when you do I/O on the big files.

I still think you should run those systems with PAGE_SIZE 64k even if
it'll waste you more memory on the small files.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/