Re: [PATCH v3] iomap: add allocation cache for iomap_dio
From: Vlastimil Babka (SUSE)
Date: Tue Mar 17 2026 - 04:30:01 EST
On 3/17/26 08:28, changfengnan wrote:
>
>> That suggests that in that test you used a larger capacity than the
>> automatically calculated one.
> The 10% improvement is because every cache has sheaves.
> When I tested 256-byte objects, the default sheaf_capacity was 26. Allocating and
> freeing 32 objects did not show a noticeable difference, but allocating and
> freeing 128 objects resulted in a significant improvement: about 3-4x in a
> multithreaded environment and about 12% in a single thread.
Great!
>>
>> > I'm thinking that maybe these improvements may not be significant enough to
>> > see the effect in the io flow.
>> > Using a simple list seems to be the most efficient approach.
>>
>> I think the question is, what improvement do you now see with your added
>> pcpu cache vs kmalloc() when 7.0-rc4 is used as the baseline?
>
> On 7.0-rc4: pcpu gets 1.20M IOPS, kmalloc gets 1.19M IOPS, and the new cache
> with sheaf_capacity set to 256 gets 1.19M IOPS.
> On 6.19: pcpu gets 1.20M IOPS, kmalloc gets 1.17M IOPS, and the new cache
> with sheaf_capacity set to 256 gets 1.19M IOPS.
Thanks a lot for that data. My conclusion is that kmalloc before sheaves did
indeed do worse, and the custom pcpu cache improved on it relatively more.
Kmalloc with sheaves does better, and the improvement from the custom pcpu
cache is smaller.
Also the default sheaf capacity seems to be enough for this workload.
IO is not my area, but getting from 1.19M to 1.20M doesn't look like it's
worth the custom code? (Possibly getting from 1.17M to 1.20M wasn't either.)
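
To put the two deltas side by side, here is a rough userspace sketch of the relative gains computed from the IOPS figures quoted above. The helper names relative_gain_pct() and print_gain() are made up for illustration and are not part of the patch or of any kernel API.

```c
#include <assert.h>
#include <stdio.h>

/* Relative gain of `improved` over `base`, in percent. */
double relative_gain_pct(double base, double improved)
{
	assert(base > 0.0);
	return (improved - base) / base * 100.0;
}

/* Print one comparison line; `label` is free-form (e.g. a kernel version). */
void print_gain(const char *label, double base, double improved)
{
	printf("%s: %.2fM -> %.2fM IOPS (%+.2f%%)\n",
	       label, base, improved, relative_gain_pct(base, improved));
}
```

Calling print_gain("7.0-rc4", 1.19, 1.20) and print_gain("6.19", 1.17, 1.20) gives roughly +0.84% and +2.56% for pcpu vs kmalloc, assuming the reported numbers are representative and not within run-to-run noise.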