Re: [PATCH] bcache: consider the fragmentation when update the writeback rate

From: Coly Li
Date: Thu Jan 14 2021 - 05:06:50 EST


On 1/14/21 12:45 PM, Dongdong Tao wrote:
> Hi Coly,
>
> I've got the testing data for multiple threads with larger IO depth.
>

Hi Dongdong,

Thanks for the testing numbers.

> *Here are the testing steps:*
> 1. make-bcache -B <> -C <> --writeback
>
> 2. Open two tabs and start a different fio task in each at the same time.
> Tab1 runs the below fio command:
> sudo fio --name=random-writers --filename=/dev/bcache0 --ioengine=libaio
> --iodepth=32 --rw=randrw --blocksize=64k,8k  --direct=1 --runtime=24000
>
> Tab2 runs the below fio command:
> sudo fio --name=random-writers2 --filename=/dev/bcache0
> --ioengine=libaio --iodepth=8 --rw=randwrite --bs=4k --rate_iops=150
> --direct=1 --write_lat_log=rw --log_avg_msec=20
>


Why do you limit the iodepth to 8 and IOPS to 150 on the cache device?
For the cache device this limitation is small. 150 IOPS with 4KB block
size means writing (150*4*60*60=) 2,160,000KB, about 2GB of data per
hour. For 35 hours it is only about 70GB.
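
Just as a quick sanity check of that arithmetic in shell, assuming a
steady 150 IOPS of 4KB writes:

  $ echo $((150 * 4 * 60 * 60)) KB per hour
  2160000 KB per hour

which is about 2GB per hour, or roughly 70GB over the 35 hours.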


What if the iodepth is 128 or 64, and there is no IOPS rate limit?
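
For example, a heavier unthrottled job could look something like the
following (the exact parameters are only a suggestion, please adjust
the runtime and block size as you see fit):

  sudo fio --name=random-writers2 --filename=/dev/bcache0 \
      --ioengine=libaio --iodepth=128 --rw=randwrite --bs=4k \
      --direct=1 --runtime=24000 --write_lat_log=rw --log_avg_msec=20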


> Note
> - Tab1 fio will run for 24000 seconds; it is the one that causes the
> fragmentation and makes the cache_available_percent drop to under 40.
> - Tab2 fio is the one whose latency I'm capturing, and I have let it
> run for about 35 hours, which is long enough to allow the
> cache_available_percent to drop under 30.
> - This testing method uses fio with a larger read block size and a
> small write block size to cause the high fragmentation. In a real
> production environment there could be various reasons, or a
> combination of reasons, that cause high fragmentation, but I believe
> it should be ok to use any method that causes fragmentation to verify
> whether bcache with this patch responds better than master in this
> situation.
>
> *Below are the testing results:*
>
> The total run time is about 35 hours, the latency points in the charts
> for each run are 1.5 million
>
> Master:
> fio-lat-mater.png
>
> Master + patch:
> fio-lat-patch.png
> Combined together:
> fio-lat-mix.png
>
> Now we can see the master is even worse when we increase the iodepth,
> which makes sense since the backing HDD is being stressed harder.
>
> *Below are the cache stats changing during the run:*
> Master:
> bcache-stats-master.png
>
> Master + the patch:
> bcache-stats-patch.png
>
> That's all the testing done on the 400GB NVMe with 512B block size.
>
> Coly, do you want me to continue the same testing on the 1TB NVMe with
> different block sizes, or is it ok to skip the 1TB testing and
> continue the test with the 400GB NVMe but with different block sizes?
> Feel free to let me know any other test scenarios that we should cover
> here.

Yes please, more testing is desired for a performance improvement. So
far I don't see performance numbers for a real heavy workload yet.
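
It would also help to record how the fragmentation develops during the
runs, for example by periodically sampling the cache set statistics
from sysfs (<cset-uuid> below is just a placeholder for your cache set
UUID):

  while true; do
      date
      cat /sys/fs/bcache/<cset-uuid>/cache_available_percent
      cat /sys/fs/bcache/<cset-uuid>/cache0/priority_stats
      sleep 300
  done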

Thanks.

Coly Li