Re: [PATCH] bcache: consider the fragmentation when update the writeback rate

From: Coly Li
Date: Thu Jan 14 2021 - 05:06:50 EST


On 1/14/21 12:45 PM, Dongdong Tao wrote:
> Hi Coly,
>
> I've got the testing data for multiple threads with larger IO depth.
>

Hi Dongdong,

Thanks for the testing numbers.

> *Here are the testing steps:*
> 1. make-bcache -B <> -C <> --writeback
>
> 2. Open two tabs and start a different fio task in each at the same time.
> Tab1 runs the below fio command:
> sudo fio --name=random-writers --filename=/dev/bcache0 --ioengine=libaio
> --iodepth=32 --rw=randrw --blocksize=64k,8k  --direct=1 --runtime=24000
>
> Tab2 runs the below fio command:
> sudo fio --name=random-writers2 --filename=/dev/bcache0
> --ioengine=libaio --iodepth=8 --rw=randwrite --bs=4k --rate_iops=150
> --direct=1 --write_lat_log=rw --log_avg_msec=20
>


Why do you limit the iodepth to 8 and IOPS to 150 on the cache device?
For the cache device this limitation is small. 150 IOPS with 4KB block
size means writing (150*4*60*60=) 2,160,000KB, about 2GB of data per
hour. For 35 hours it is only about 70GB.
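
Just as a quick sanity check of that arithmetic in shell, assuming a
steady 150 IOPS of 4KB writes:

  $ echo $((150 * 4 * 60 * 60)) KB per hour
  2160000 KB per hour

which is about 2GB per hour, or roughly 70GB over the 35 hours.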


What if the iodepth is 128 or 64, and there is no IOPS rate limit?
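
For example, a heavier unthrottled job could look something like the
following (the exact parameters are only a suggestion, please adjust
the runtime and block size as you see fit):

  sudo fio --name=random-writers2 --filename=/dev/bcache0 \
      --ioengine=libaio --iodepth=128 --rw=randwrite --bs=4k \
      --direct=1 --runtime=24000 --write_lat_log=rw --log_avg_msec=20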


> Note
> - Tab1 fio will run for 24000 seconds; it is the one that causes the
> fragmentation and makes the cache_available_percent drop to under 40.
> - Tab2 fio is the one whose latency I'm capturing, and I have let it
> run for about 35 hours, which is long enough to allow the
> cache_available_percent to drop under 30.
> - This testing method uses fio with a larger read block size and a
> small write block size to cause the high fragmentation. In a real
> production environment there could be various reasons, or a
> combination of reasons, that cause high fragmentation, but I believe
> it should be ok to use any method that causes fragmentation to verify
> whether bcache with this patch responds better than master in this
> situation.
>
> *Below are the testing results:*
>
> The total run time is about 35 hours, the latency points in the charts
> for each run are 1.5 million
>
> Master:
> fio-lat-mater.png
>
> Master + patch:
> fio-lat-patch.png
> Combined together:
> fio-lat-mix.png
>
> Now we can see the master is even worse when we increase the iodepth,
> which makes sense since the backing HDD is being stressed harder.
>
> *Below are the cache stats changing during the run:*
> Master:
> bcache-stats-master.png
>
> Master + the patch:
> bcache-stats-patch.png
>
> That's all the testing done on the 400GB NVMe with 512B block size.
>
> Coly, do you want me to continue the same testing on the 1TB NVMe with
> different block sizes, or is it ok to skip the 1TB testing and
> continue the test with the 400GB NVMe but with different block sizes?
> Feel free to let me know any other test scenarios that we should cover
> here.

Yes please, more testing is desired for a performance improvement. So
far I don't see performance numbers for a real heavy workload yet.
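
It would also help to record how the fragmentation develops during the
runs, for example by periodically sampling the cache set statistics
from sysfs (<cset-uuid> below is just a placeholder for your cache set
UUID):

  while true; do
      date
      cat /sys/fs/bcache/<cset-uuid>/cache_available_percent
      cat /sys/fs/bcache/<cset-uuid>/cache0/priority_stats
      sleep 300
  done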

Thanks.

Coly Li