Re: [00/17] Large Blocksize Support V3

From: Maxim Levitsky
Date: Sat Apr 28 2007 - 12:41:01 EST

On Wednesday 25 April 2007 01:21, clameter@xxxxxxx wrote:

> Rationales:
> 1. We have problems supporting devices with a higher blocksize than
> page size. This is for example important to support CD and DVDs that
> can only read and write 32k or 64k blocks. We currently have a shim
> layer in there to deal with this situation which limits the speed
> of I/O. The developers are currently looking for ways to completely
> bypass the page cache because of this deficiency.
> 2. 32/64k blocksize is also used in flash devices. Same issues.
> 3. Future harddisks will support bigger block sizes that Linux cannot
> support since we are limited to PAGE_SIZE. Ok the on board cache
> may buffer this for us but what is the point of handling smaller
> page sizes than what the drive supports?
> 4. Reduce fsck times. Larger block sizes mean faster file system checking.
> 5. Performance. If we look at IA64 vs. x86_64 then it seems that the
> faster interrupt handling on x86_64 compensate for the speed loss due to
> a smaller page size (4k vs 16k on IA64). Supporting larger block sizes
> sizes on all allows a significant reduction in I/O overhead and
> increases the size of I/O that can be performed by hardware in a single
> request since the number of scatter gather entries are typically limited
> for one request. This is going to become increasingly important to support
> the ever growing memory sizes since we may have to handle excessively large
> amounts of 4k requests for data sizes that may become common soon. For
> example to write a 1 terabyte file the kernel would have to handle 256
> million 4k chunks.
> 6. Cross arch compatibility: It is currently not possible to mount
> an 16k blocksize ext2 filesystem created on IA64 on an x86_64 system.
> With this patch this becoems possible.


I have a few questions about that patchset:

1) Is it possible for block device to assume that it will alway get big
requests (and aligned by big blocksize) ?

2) Does metadata reading/writing occuress also using same big blocksize ?

3 If so, How __bread/__getblk are affrected? Does returned buffer_head point
to whole block ?

And what do you think about mine design ?
I want to link parts of compound page through buffer_heads
So the head page's bh points to second page (tail page ) bh's, and from this
bh it is possible to reference the page itself and so on.
(This will allow a compound page be physicly fragmented)

Best regards,
Maxim Levitsky


I ask questions since this patchset does matter to me, I really like to see
this <= 4K limit lifted (all software limits are bad)
And finaly get good packet writing... I miss DirectCD much...
Altough >4K blocksizes are really only first step.
To make really fast packet writing, the UDF filesystem should be rewritten as
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at
Please read the FAQ at