Re: [PATCH 0/6] dma mapping/iommu: Allow IOMMU IOVA rcache range to be configured

From: John Garry
Date: Fri Mar 19 2021 - 11:44:59 EST


On 19/03/2021 13:40, Christoph Hellwig wrote:
> On Fri, Mar 19, 2021 at 09:25:42PM +0800, John Garry wrote:
>> For streaming DMA mappings involving an IOMMU and whose IOVA len regularly
>> exceeds the IOVA rcache upper limit (meaning that they are not cached),
>> performance can be reduced.
>>
>> This is much more pronounced from commit 4e89dce72521 ("iommu/iova: Retry
>> from last rb tree node if iova search fails"), as discussed at [0].
>>
>> IOVAs which cannot be cached are highly involved in the IOVA aging issue,
>> as discussed at [1].

> I'm confused. If this is a limit in the IOVA allocator, dma-iommu should
> be able to just not grow the allocation so large without help from
> the driver.

This is not an issue with the IOVA allocator.

The issue is with how the IOVA code handles caching of IOVAs. Specifically, on DMA unmap, an IOVA whose length is above a fixed threshold is freed rather than cached. See free_iova_fast().
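
As a rough userspace illustration of that threshold check (not the kernel code itself): the sketch below assumes the limit is expressed as IOVA_RANGE_CACHE_MAX_SIZE = 6 size orders, as I recall it from drivers/iommu/iova.c, so that only IOVAs of up to 32 pages are cached. The helper names order_base_2_sketch() and iova_is_cacheable() are made up for this example.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed value, mirroring drivers/iommu/iova.c around the time of this thread. */
#define IOVA_RANGE_CACHE_MAX_SIZE 6

/*
 * Smallest order such that (1UL << order) >= pages, similar in spirit to
 * the kernel's order_base_2(). Hypothetical helper for this sketch only.
 */
static unsigned int order_base_2_sketch(unsigned long pages)
{
	unsigned int order = 0;

	assert(pages > 0);
	while ((1UL << order) < pages)
		order++;
	return order;
}

/*
 * Returns true if an IOVA of this size (in pages) would be put in the
 * rcache on free, false if it would go back to the rb tree instead.
 * Hypothetical helper; the real decision is made inside the rcache
 * insert path called from free_iova_fast().
 */
static bool iova_is_cacheable(unsigned long pages)
{
	return order_base_2_sketch(pages) < IOVA_RANGE_CACHE_MAX_SIZE;
}
```

Under those assumptions, iova_is_cacheable(32) is true and iova_is_cacheable(33) is false, i.e. with 4K pages anything above 128K bypasses the cache.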

For performance reasons, I want that threshold increased for my driver, so that it benefits from caching for every IOVA length we may see. Currently we see IOVAs whose length exceeds that threshold. But it may not be good to increase that threshold for everyone.

> If contrary to the above description it is device-specific, the driver
> could simply use dma_get_max_seg_size().

But that is for a single segment, right? Is there something equivalent to tell how many scatter-gather elements we may generate, like scsi_host_template.sg_tablesize?

Thanks,
John