NWFS sb->s_blocksize issues

From: Jeff V. Merkey (jmerkey@timpanogas.com)
Date: Thu Aug 31 2000 - 11:22:55 EST


Why was it necessary to interlock the blocksize a file system reports in
its superblock with both the page cache and the ll_rw_block()
interface? I've gotten to the bottom of the bmap() problem I was having
with 1024/2048/4096 blocksizes, and it's related to the way the page
cache asks for pages. NWFS will suballocate files smaller than the
volume blocksize (4K, 8K, 16K, 32K, 64K). These suballocated file runs
are not always block aligned, which results in the page cache getting
partial or scattered runs.
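
To make the alignment issue concrete, here's a rough userspace sketch
(the sector offset and sizes are made up for illustration, not taken
from real NWFS metadata) showing how one page of a suballocated file
can straddle the sb->s_blocksize boundaries the page cache assumes:

/*
 * Illustrative only: a suballocated file run starts at an arbitrary
 * 512-byte sector on the volume.  If the page cache insists on
 * sb->s_blocksize-sized, blocksize-aligned buffer heads, a single 4K
 * page of file data can straddle block boundaries and map to
 * partial/scattered blocks.
 */
#include <stdio.h>

#define SECTOR_SIZE  512
#define PAGE_BYTES   4096

int main(void)
{
    unsigned long fs_blocksize = 4096;           /* sb->s_blocksize */
    unsigned long run_start_sector = 73;         /* unaligned suballoc run */
    unsigned long run_start_byte = run_start_sector * SECTOR_SIZE;

    /* Device-block range touched by the first page of the file. */
    unsigned long first_block = run_start_byte / fs_blocksize;
    unsigned long last_block  = (run_start_byte + PAGE_BYTES - 1) / fs_blocksize;

    printf("run starts at byte %lu (sector %lu)\n",
           run_start_byte, run_start_sector);
    printf("first page of data spans device blocks %lu..%lu "
           "(starting %lu bytes into block %lu)\n",
           first_block, last_block,
           run_start_byte % fs_blocksize, first_block);
    return 0;
}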

The page cache always seems to assume that the blocks for a page will
conform to the sb->s_blocksize value (which is OK); however, what's
stilted about this model is that the underlying logic in ll_rw_block()
enforces buffer head reads at this blocksize. My benchmarking shows the
best performance on Linux at a 1024 blocksize (drivers seem optimized
for this case). I have found that once I set sb->s_blocksize to a given
value, buffer heads of any other size get rejected, because
set_blocksize() fails when the size does not match the superblock
value.
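
This is a paraphrase of the behavior I keep running into, sketched in
userspace (not the actual kernel code, and the struct and function
names are made up): once the device blocksize is established, any
buffer head submitted with a different b_size is simply refused.

#include <stdio.h>

struct fake_buffer_head {
    unsigned long b_blocknr;
    unsigned int  b_size;
};

static unsigned int device_blocksize = 1024;  /* as if set by set_blocksize() */

static int submit_bh_checked(struct fake_buffer_head *bh)
{
    /* Reject any buffer head whose size differs from the device blocksize. */
    if (bh->b_size != device_blocksize) {
        fprintf(stderr, "rejected: b_size %u != device blocksize %u\n",
                bh->b_size, device_blocksize);
        return -1;
    }
    printf("queued block %lu (%u bytes)\n", bh->b_blocknr, bh->b_size);
    return 0;
}

int main(void)
{
    struct fake_buffer_head ok  = { .b_blocknr = 10, .b_size = 1024 };
    struct fake_buffer_head bad = { .b_blocknr = 11, .b_size = 4096 };

    submit_bh_checked(&ok);     /* accepted */
    submit_bh_checked(&bad);    /* refused, size mismatch */
    return 0;
}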

I can get around this problem completely by setting everything to a
512-byte blocksize and using buffer heads of that size; however, this
eats up more memory and creates more overhead in the ll_rw_block() code,
since it has to merge all these requests, plus there's the overhead of
allocating tons of buffer heads for all the 512-byte blocks. It also
increases (oink) the iterative bmap() calling needed to map a file,
since smaller blocksizes mean more calls through the page cache
interface. I guess what I am asking is: why does there need to be a
correlation between the page cache's superblock blocksize and an
artificial enforcement underneath in ll_rw_block() that forces block
I/O operations to be the same size?
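
Rough arithmetic on what the 512-byte workaround costs (illustrative
numbers only, using a 1 MB example file):

#include <stdio.h>

int main(void)
{
    unsigned long file_size = 1024 * 1024;       /* 1 MB example file */
    unsigned int  sizes[] = { 512, 1024, 2048, 4096 };
    unsigned int  i;

    /* One buffer head (and one block lookup) per block at each size. */
    for (i = 0; i < sizeof(sizes) / sizeof(sizes[0]); i++) {
        unsigned long blocks = file_size / sizes[i];
        printf("blocksize %4u: %6lu buffer heads / block lookups per MB\n",
               sizes[i], blocks);
    }
    return 0;
}

At 512 bytes that's eight times the buffer heads and mapping calls of
the 4K case, which is where the extra memory and bmap() traffic come
from.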

One would think that, ideally, I/O requests would allow variable-length
sector operations up to any size, with a config option allowing the
page cache to bmap() at sizes independent of the I/O requests sent to
the disks.

I can get around what's there, but it's wasteful of memory and runs
slower. This is probably a Linus issue.

Please Advise,

Jeff