Re: [dm-devel] [PATCH 0/2] Introduce the request handling for dm-crypt

From: Baolin Wang
Date: Thu Dec 03 2015 - 05:36:55 EST


On 3 December 2015 at 10:56, Baolin Wang <baolin.wang@xxxxxxxxxx> wrote:
> On 3 December 2015 at 03:56, Alasdair G Kergon <agk@xxxxxxxxxx> wrote:
>> On Wed, Dec 02, 2015 at 08:46:54PM +0800, Baolin Wang wrote:
>>> These are the benchmarks for request-based dm-crypt. Please check them.
>>
>> Now please put request-based dm-crypt completely to one side and focus
>> just on the existing bio-based code. Why is it slower and what can be
>> adjusted to improve this?
>>
>
> OK. I think I have found a few things that need to be pointed out.
> 1. The IO block size test in the performance report shows that, for
> request-based, simply increasing the IO size does not give a
> corresponding increase in performance. dm-crypt maps the data buffer
> of one request with scatterlists and sends all the scatterlists of
> that request to the encryption engine to encrypt or decrypt. I found
> that fewer scatterlist entries, each with a larger length, improve
> the encryption speed and let the engine perform at its best. But a
> big IO size does not necessarily mean bigger scatterlist entries (it
> may be mapped as many entries with small lengths), which I think is
> why just increasing the IO size does not give the corresponding
> performance.
>
> 2. Why is bio-based slower?
> Given point 1, it follows that the crypto engine prefers bigger
> scatterlists. But the bio-based code sends only one scatterlist entry
> (whose length is always '1 << SECTOR_SHIFT' = 512 bytes) to the
> crypto engine at a time. So for a 1M bio, the bio-based code calls
> into the crypto engine 2048 times (each time with a single 512-byte
> scatterlist entry), which is more time-consuming and inefficient for
> the crypto engine. The request-based code, by contrast, can map the
> whole request with many scatterlist entries (not just one) and send
> them all to the crypto engine together, which improves performance.
> Is that right?
>
> Another possible optimization, I think, is to increase the number of
> scatterlist entries for bio-based.
>

I tested my assumption about increasing the number of scatterlist
entries for bio-based. I modified the bio-based code to support
multiple scatterlist entries, and with that change it gets the same
performance as request-based.
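
To make the arithmetic from point 2 concrete, here is a rough sketch in
Python, for illustration only. PAGE_SIZE and the helper names are
assumptions of mine and are not taken from the dm-crypt code; the
per-page mapping is just one hypothetical way the entries could be
built. It only counts calls and entries and does not model the engine
itself.

```python
SECTOR_SHIFT = 9                  # same value as the kernel constant
SECTOR_SIZE = 1 << SECTOR_SHIFT   # 512 bytes
PAGE_SIZE = 4096                  # assumption: 4 KiB pages


def per_sector_calls(bio_bytes):
    """Crypto-engine calls when each call carries one 512-byte entry."""
    return bio_bytes // SECTOR_SIZE


def per_page_entries(bio_bytes):
    """Scatterlist entries if each page of the bio becomes one entry and
    the whole list is handed to the engine in a single call
    (hypothetical mapping; rounds up for a partial last page)."""
    return (bio_bytes + PAGE_SIZE - 1) // PAGE_SIZE


if __name__ == "__main__":
    for size in (64 * 1024, 1024 * 1024):
        print(size, per_sector_calls(size), per_page_entries(size))
```

For a 1M bio this gives 2048 calls in the per-sector case, against one
call carrying 256 entries in the per-page case.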

1. Sequential read 1G with bio-based, with more scatterlist entries:
time dd if=/dev/dm-0 of=/dev/null bs=64K count=16384 iflag=direct
1073741824 bytes (1.1 GB) copied, 94.5458 s, 11.4 MB/s
real 1m34.562s
user 0m0.030s
sys 0m3.850s

2. Sequential read 1G with request-based:
time dd if=/dev/dm-0 of=/dev/null bs=64K count=16384 iflag=direct
1073741824 bytes (1.1 GB) copied, 94.8922 s, 11.3 MB/s
real 1m34.908s
user 0m0.030s
sys 0m4.000s
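
As a quick sanity check on the two dd runs above, the throughput can be
recomputed from the byte count and elapsed time. This is only a sketch;
the function and variable names are made up for illustration, and it
assumes dd reports MB as 10^6 bytes.

```python
def throughput_mb_s(total_bytes, seconds):
    """Throughput in decimal megabytes per second (MB = 10**6 bytes)."""
    return total_bytes / seconds / 1e6


TOTAL = 64 * 1024 * 16384   # bs=64K * count=16384 = 1073741824 bytes

bio_based = throughput_mb_s(TOTAL, 94.5458)       # run 1
request_based = throughput_mb_s(TOTAL, 94.8922)   # run 2

# Relative gap between the two runs, as a fraction of the faster one.
gap = abs(bio_based - request_based) / max(bio_based, request_based)

if __name__ == "__main__":
    print(f"bio-based:     {bio_based:.1f} MB/s")
    print(f"request-based: {request_based:.1f} MB/s")
    print(f"gap:           {gap:.2%}")
```

The two runs differ by well under 1%.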

From the data, we can see that bio-based can also reach the same
performance as request-based. So if anyone still doesn't like the
request-based approach, I think we can optimize bio-based by
increasing the number of scatterlist entries. Thanks.

>>
>> Alasdair
>>



--
Baolin.wang
Best Regards