On Thu, Sep 17, 2015 at 11:19 PM, Jens Axboe <axboe@xxxxxxxxx> wrote:
On 09/17/2015 09:13 AM, Ming Lei wrote:
biovecs have been immutable since v3.13, so it isn't necessary
to allocate biovecs for the newly cloned bios. This saves one
extra biovec allocation/copy; that allocation is often not
fixed-length and is a bit more expensive.
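To illustrate why immutability makes the copy unnecessary, here is a rough user-space sketch. It is only a toy model: `ToyBio`, `deep_clone`, `fast_clone` and `split` are hypothetical names, not kernel APIs, and the real bio structures carry far more state. The point is just that when nobody modifies the segment table after setup, a clone only needs its own iterator (start/size) and can point at the parent's table.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical user-space model; none of these names are kernel APIs.
Segment = Tuple[str, int, int]  # (page id, offset in page, length in bytes)


@dataclass
class ToyBio:
    vecs: List[Segment]  # segment table, treated as read-only after setup
    start: int           # iterator: byte offset into the data described by vecs
    size: int            # iterator: bytes remaining from start


def deep_clone(bio: ToyBio) -> ToyBio:
    """Copying clone: builds a new segment table (one extra allocation + copy)."""
    return ToyBio(list(bio.vecs), bio.start, bio.size)


def fast_clone(bio: ToyBio) -> ToyBio:
    """Sharing clone: reuses the parent's table, only the iterator is private."""
    return ToyBio(bio.vecs, bio.start, bio.size)


def split(bio: ToyBio, nbytes: int) -> ToyBio:
    """Split the first nbytes off into a fast clone and advance the parent."""
    if not 0 < nbytes < bio.size:
        raise ValueError("split point must fall strictly inside the bio")
    head = fast_clone(bio)
    head.size = nbytes      # the clone covers only the front part
    bio.start += nbytes     # the parent now begins after the split point
    bio.size -= nbytes
    return head


# A 1 MiB request described by 256 page-sized segments.
parent = ToyBio([(f"page{i}", 0, 4096) for i in range(256)], 0, 1024 * 1024)
head = split(parent, 16 * 1024)
```

In this model the split touches only two small iterators; the 256-entry table is never copied, which is the saving the patch description is talking about.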
For example, if 'max_sectors_kb' of null_blk's queue is set to
16 (32 sectors) via sysfs just to force more splits, this patch
increases throughput by about 70% in the sequential read test over
null_blk (direct io, bs: 1M).
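As a back-of-the-envelope check on how much splitting that setting causes, the sketch below works out the numbers. It assumes 512-byte sectors, that "bs: 1M" means 1 MiB, that each I/O reaches the block layer as one request, and that max_sectors_kb is the only limit that matters; `splits_per_request` is a made-up helper name.

```python
SECTOR_BYTES = 512  # assumption: classic 512-byte sectors


def splits_per_request(block_size_bytes: int, max_sectors_kb: int):
    """Return (sectors per piece, number of pieces) for one request."""
    limit_bytes = max_sectors_kb * 1024
    sectors_per_piece = limit_bytes // SECTOR_BYTES
    # Ceiling division: a trailing partial piece still counts as a piece.
    pieces = -(-block_size_bytes // limit_bytes)
    return sectors_per_piece, pieces


sectors, pieces = splits_per_request(1024 * 1024, 16)
print(sectors, pieces)  # 32 64
```

So under these assumptions every 1M I/O is carved into 64 pieces of 32 sectors each, and any per-clone cost is paid dozens of times per I/O.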
I'd be curious how this compares to before we did the splitting, not
exceeding the limits through bio_add_page() instead?
Let me show these test results:
----------------------------------------------------------------------
kernel                             | throughput
----------------------------------------------------------------------
4.3.0-rc1-next-20150916            | bw=12227MB/s, iops=12227
----------------------------------------------------------------------
4.3.0-rc1-next-20150916 with patch | bw=21011MB/s, iops=21011
----------------------------------------------------------------------
v4.2                               | bw=18959MB/s, iops=18958
----------------------------------------------------------------------
So from the above, it looks like this patch is a kind of fix for the
performance regression introduced by 54efd50bfd (block: make
generic_make_request handle arbitrarily sized bios). :-)
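For reference, the relative differences between the three bandwidth figures above can be worked out as follows. This is just arithmetic on the reported bw values; `pct_change` and the dictionary keys are names chosen here for illustration.

```python
# Bandwidth figures (MB/s) copied from the results above.
RESULTS_MB_S = {
    "next-20150916": 12227,
    "next-20150916 + patch": 21011,
    "v4.2": 18959,
}


def pct_change(new: float, old: float) -> float:
    """Percentage change going from old to new."""
    return (new - old) / old * 100.0


base = RESULTS_MB_S["next-20150916"]
patched = RESULTS_MB_S["next-20150916 + patch"]
v42 = RESULTS_MB_S["v4.2"]

print(f"patch vs. unpatched next: {pct_change(patched, base):+.1f}%")  # +71.8%
print(f"unpatched next vs. v4.2:  {pct_change(base, v42):+.1f}%")      # -35.5%
print(f"patch vs. v4.2:           {pct_change(patched, v42):+.1f}%")   # +10.8%
```

That is, unpatched next is roughly 35% below v4.2 in this test, and with the patch it ends up roughly 11% above v4.2.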