Re: [00/17] Large Blocksize Support V3

From: Andrew Morton
Date: Thu Apr 26 2007 - 22:05:55 EST

On Tue, 24 Apr 2007 15:21:05 -0700 clameter@xxxxxxx wrote:

> This patchset modifies the Linux kernel so that larger block sizes than
> page size can be supported. Larger block sizes are handled by using
> compound pages of an arbitrary order for the page cache instead of
> single pages with order 0.

Something I was looking for but couldn't find: suppose an application takes
a pagefault against the third 4k page of an order-2 pagecache "page". We
need to instantiate a pte against find_get_page(offset/4)+3. But these
patches don't touch mm/memory.c at all and filemap_nopage() appears to
return the zeroth 4k page all the time in that case.
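
To make the arithmetic concrete, here is a minimal userspace sketch of the
lookup a fault handler would need. It is not kernel code and not taken from
the patchset: `struct fault_target`, `locate_subpage()` and `BASE_PAGE_SHIFT`
are hypothetical names, and it assumes 4k base pages with subpages counted
from zero.

```c
#include <assert.h>
#include <stddef.h>

#define BASE_PAGE_SHIFT 12	/* assumption: 4k base pages */

/* Hypothetical result type: where a faulting 4k page lives in the pagecache. */
struct fault_target {
	unsigned long cache_index;	/* index of the order-N pagecache "page" */
	unsigned long subpage;		/* 4k page within it, counted from zero */
};

/*
 * Split a file offset expressed in 4k pages (pgoff) into the index of the
 * higher-order pagecache page and the 4k subpage inside it.
 */
static struct fault_target locate_subpage(unsigned long pgoff, unsigned int order)
{
	struct fault_target t;

	assert(order < sizeof(unsigned long) * 8);
	t.cache_index = pgoff >> order;			/* offset/4 for order 2 */
	t.subpage = pgoff & ((1UL << order) - 1);	/* 0..3 for order 2 */
	return t;
}
```

For order 2, `locate_subpage(7, 2)` gives cache index 1 and subpage 3, which
is the find_get_page(offset/4)+3 case above. The pte has to point at that
subpage, not at the head page.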

So.. what am I missing, and how does that part work?

Also, afaict your important requirements would be met by retaining
PAGE_CACHE_SIZE=4k and simply ensuring that pagecache is populated by
physically contiguous pages - so instead of allocating and adding one 4k
page, we allocate an order-2 page and sprinkle all four page*'s into the
radix tree in one hit. That should be fairly straightforward to do, and
could be made indistinguishably fast from doing a single 16k page for some
common pagecache operations (gang-insert, gang-lookup).
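
A rough userspace model of that alternative, to show the shape of the idea
only. The flat array stands in for the radix tree, and `struct toy_page`,
`struct toy_mapping`, `toy_gang_insert()` and `toy_find_page()` are all
invented for this sketch. None of them are kernel interfaces.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_CACHE_SLOTS 64	/* arbitrary size for the toy index */

struct toy_page {
	unsigned long pfn;	/* physical frame number */
};

/* Stand-in for the per-file radix tree: one slot per 4k page index. */
struct toy_mapping {
	struct toy_page *slots[TOY_CACHE_SLOTS];
};

/*
 * Insert all (1 << order) 4k pages of one physically contiguous block at
 * consecutive indices in a single call. Returns 0 on success, or -1 with
 * nothing inserted if the range is out of bounds or a slot is occupied.
 */
static int toy_gang_insert(struct toy_mapping *m, unsigned long index,
			   struct toy_page *first, unsigned int order)
{
	unsigned long n = 1UL << order;
	unsigned long i;

	assert(m && first);
	if (index >= TOY_CACHE_SLOTS || n > TOY_CACHE_SLOTS - index)
		return -1;
	for (i = 0; i < n; i++)
		if (m->slots[index + i])
			return -1;
	for (i = 0; i < n; i++)
		m->slots[index + i] = &first[i];
	return 0;
}

/* Lookup stays per-4k-page, so the fault path needs no changes. */
static struct toy_page *toy_find_page(const struct toy_mapping *m,
				      unsigned long index)
{
	return index < TOY_CACHE_SLOTS ? m->slots[index] : NULL;
}
```

The point of the sketch is that each 4k index still resolves to its own
page*, so PAGE_CACHE_SIZE stays at 4k while the backing memory is contiguous.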

The BIO and block layers will do-the-right-thing with that pagecache and
you end up with four times more data in the SG lists, worst-case.
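
A small sketch of where the "four times, worst-case" figure comes from. It
assumes a simplified model in which physically adjacent 4k pages may be
coalesced into one SG entry; `count_sg_entries()` is a hypothetical helper,
not a block layer function, and real merging is subject to device limits
that are ignored here.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Count the SG entries needed for n 4k pages given their frame numbers.
 * With merge == 0 every page gets its own entry (the worst case); with
 * merge != 0 a page whose pfn directly follows the previous one is folded
 * into the previous entry.
 */
static size_t count_sg_entries(const unsigned long *pfns, size_t n, int merge)
{
	size_t entries = 0;
	size_t i;

	assert(pfns || n == 0);
	for (i = 0; i < n; i++) {
		if (i == 0 || !merge || pfns[i] != pfns[i - 1] + 1)
			entries++;
	}
	return entries;
}
```

Under this model, four contiguous pages cost four entries without merging
and one entry with it, so the 4x figure is an upper bound for order 2.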