Re: [PATCH net] xsk: remove cheap_dma optimization

From: Björn Töpel
Date: Wed Jul 01 2020 - 06:18:03 EST


On 2020-06-29 17:41, Robin Murphy wrote:
On 2020-06-28 18:16, Björn Töpel wrote:
[...]
Somewhat related to the DMA API: it would have performance benefits for
AF_XDP if the DMA range of the mapped memory was linear, i.e. by IOMMU
utilization. I've started hacking on something a little bit, but it would
be nice if such an API was part of the mapping core.

Input:  array of pages
Output: array of dma addrs
(and obviously dev, flags and such)

For non-IOMMU len(array of pages) == len(array of dma addrs)
For best-case IOMMU len(array of dma addrs) == 1 (large linear space)

But that's for later. :-)

FWIW you will typically get that behaviour from IOMMU-based implementations of dma_map_sg() right now, although it's not strictly guaranteed. If you can weather some additional setup cost of calling sg_alloc_table_from_pages() plus walking the list after mapping to test whether you did get a contiguous result, you could start taking advantage of it as some of the dma-buf code in DRM and v4l2 does already (although those cases actually treat it as a strict dependency rather than an optimisation).
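
The "walk the list after mapping" check described above could be sketched roughly as follows. This is a user-space model only, not kernel code: struct dma_seg and dma_segs_contiguous() are hypothetical names made up for illustration, standing in for the values one would read with sg_dma_address() and sg_dma_len() from each entry after dma_map_sg().

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical user-space stand-in for one mapped scatterlist segment.
 * In the kernel these values would come from sg_dma_address() and
 * sg_dma_len() on each mapped entry.
 */
struct dma_seg {
	uint64_t addr;	/* DMA address of the segment */
	uint32_t len;	/* segment length in bytes */
};

/* Return true if the nents mapped segments form one linear DMA range. */
static bool dma_segs_contiguous(const struct dma_seg *segs, size_t nents)
{
	size_t i;

	if (nents == 0)
		return false;	/* nothing was mapped */

	for (i = 1; i < nents; i++) {
		/* Each segment must start exactly where the previous one ends. */
		if (segs[i].addr != segs[i - 1].addr + segs[i - 1].len)
			return false;
	}
	return true;
}
```

If the check fails, the caller would fall back to the per-page address array. In practice an IOMMU-backed dma_map_sg() may already merge entries and return a smaller count, so a return value of 1 would likely indicate the same thing, but that is an assumption worth verifying against the implementation in use.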

I'm inclined to agree that if we're going to see more of these cases, a new API call that did formally guarantee a DMA-contiguous mapping (either via IOMMU or bounce buffering) or failure might indeed be handy.


I forgot to reply to this one! My current hack is using the iommu code directly, similar to what vfio-pci does (hopefully not gutting the API this time ;-)).

Your approach sounds much nicer, and easier. I'll try that out! Thanks a lot for the pointers, and I might be back with more questions.


Cheers,
Björn

Robin.