On Mon, Apr 29, 2019 at 12:06:52PM +0100, Robin Murphy wrote:
> From the reply up-thread I guess you're trying to include an optimisation
> to only copy the head and tail of the buffer if it spans multiple pages,
> and directly map the ones in the middle, but AFAICS that's going to tie you
> to also using strict mode for TLB maintenance, which may not be a win
> overall depending on the balance between invalidation bandwidth vs. memcpy
> bandwidth. At least if we use standard SWIOTLB logic to always copy the
> whole thing, we should be able to release the bounce pages via the flush
> queue to allow 'safe' lazy unmaps.

Oh. The head and tail optimization is what I missed. Yes, for that
we'd need the offset.
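
For illustration only, a minimal userspace sketch of why the head and tail
variant needs the offset: the offset of the buffer within its first page
decides how many bytes are bounced at each end and how many whole pages in
the middle could be mapped directly. The names split_buffer, struct
bounce_split and SKETCH_PAGE_SIZE are hypothetical and not taken from any
patch in this thread, and a 4 KiB page size is assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumed page size, illustration only */

/* Hypothetical result type: how a buffer would be carved up. */
struct bounce_split {
	size_t head;	/* bytes before the first page boundary (bounced) */
	size_t middle;	/* bytes in whole, aligned pages (mapped directly) */
	size_t tail;	/* bytes after the last whole page (bounced) */
};

/* Hypothetical helper: split [addr, addr + len) at page boundaries. */
static struct bounce_split split_buffer(uintptr_t addr, size_t len)
{
	struct bounce_split s = { 0, 0, 0 };
	size_t total = len;
	size_t offset = addr & (SKETCH_PAGE_SIZE - 1);
	size_t head = offset ? SKETCH_PAGE_SIZE - offset : 0;

	if (head >= len) {
		/* Buffer ends within its first, partial page: bounce it all. */
		s.head = len;
		return s;
	}

	s.head = head;
	len -= head;
	/* Whatever is left starts page aligned; keep the whole pages. */
	s.middle = len & ~(size_t)(SKETCH_PAGE_SIZE - 1);
	s.tail = len - s.middle;

	assert(s.head + s.middle + s.tail == total);
	return s;
}
```

With these assumptions, a buffer at 0x1100 of length 0x3000 splits into a
0xf00 byte head, 0x2000 bytes of directly mappable pages and a 0x100 byte
tail, while a page-aligned buffer of whole pages needs no bouncing at all.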

> Either way I think it would be worth just implementing the straightforward
> version first, then coming back to consider optimisations later.

Agreed, let's start simple. Especially as large DMA mappings or
allocations should usually be properly aligned anyway, and if not we
should fix that for multiple reasons.