[PATCH v4 0/7] dma-buf: Performance improvements for system heap & a system-uncached implementation

From: John Stultz
Date: Sat Oct 17 2020 - 02:01:54 EST


Hey All,
So this is another revision of my patch series to performance
optimizations to the dma-buf system heap.

This series reworks the system heap to use sgtables, and then
consolidates the pagelist method from the heap-helpers into the
CMA heap. After which the heap-helpers logic is removed (as it
is unused). I'd still like to find a better way to avoid some of
the logic duplication in implementing the entire dma_buf_ops
handlers per heap. But unfortunately that code is tied somewhat
to how the buffer's memory is tracked.

After this, the series introduces an optimization that
Ørjan Eide implemented for ION that avoids calling sync on
attachments that don't have a mapping.

Next, an optimization to use larger order pages for the system
heap. This change brings us closer to the current performance
of the ION allocation code (though there still is a gap due
to ION using a mix of deferred-freeing and page pools, I'll be
looking at integrating those eventually).

Finally, a reworked version of my uncached system heap
implementation I was submitting a few weeks back. Since it
duplicated a lot of the now reworked system heap code, I
realized it would be much simpler to add the functionality to
the system_heap implementaiton itself.

While not improving the core allocation performance, the
uncached heap allocations do result in *much* improved
performance on HiKey960 as it avoids a lot of flushing and
invalidating buffers that the cpu doesn't touch often.

Feedback on these would be great!

thanks
-john

New in v4:
* Make sys_heap static (indirectly) Reported-by:
kernel test robot <lkp@xxxxxxxxx>
* Spelling fixes suggested by BrianS
* Make sys_uncached_heap static, as
Reported-by: kernel test robot <lkp@xxxxxxxxx>
* Fix wrong return value, caught by smatch
Reported-by: kernel test robot <lkp@xxxxxxxxx>
Reported-by: Dan Carpenter <dan.carpenter@xxxxxxxxxx>
* Ensure we call flush/invalidate_kernel_vmap_range() in the
uncached cases to try to address feedback about VIVT caches
from Christoph
* Reorder a few lines as suggested by BrianS
* Avoid holding the initial mapping for the lifetime of the buffer
as suggested by BrianS
* Fix a unlikely race between allocate and updating the dma_mask
that BrianS noticed.


Cc: Sumit Semwal <sumit.semwal@xxxxxxxxxx>
Cc: Liam Mark <lmark@xxxxxxxxxxxxxx>
Cc: Laura Abbott <labbott@xxxxxxxxxx>
Cc: Brian Starkey <Brian.Starkey@xxxxxxx>
Cc: Hridya Valsaraju <hridya@xxxxxxxxxx>
Cc: Suren Baghdasaryan <surenb@xxxxxxxxxx>
Cc: Sandeep Patil <sspatil@xxxxxxxxxx>
Cc: Daniel Mentz <danielmentz@xxxxxxxxxx>
Cc: Chris Goldsworthy <cgoldswo@xxxxxxxxxxxxxx>
Cc: Ørjan Eide <orjan.eide@xxxxxxx>
Cc: Robin Murphy <robin.murphy@xxxxxxx>
Cc: Ezequiel Garcia <ezequiel@xxxxxxxxxxxxx>
Cc: Simon Ser <contact@xxxxxxxxxxx>
Cc: James Jones <jajones@xxxxxxxxxx>
Cc: linux-media@xxxxxxxxxxxxxxx
Cc: dri-devel@xxxxxxxxxxxxxxxxxxxxx

John Stultz (7):
dma-buf: system_heap: Rework system heap to use sgtables instead of
pagelists
dma-buf: heaps: Move heap-helper logic into the cma_heap
implementation
dma-buf: heaps: Remove heap-helpers code
dma-buf: heaps: Skip sync if not mapped
dma-buf: system_heap: Allocate higher order pages if available
dma-buf: dma-heap: Keep track of the heap device struct
dma-buf: system_heap: Add a system-uncached heap re-using the system
heap

drivers/dma-buf/dma-heap.c | 33 +-
drivers/dma-buf/heaps/Makefile | 1 -
drivers/dma-buf/heaps/cma_heap.c | 327 +++++++++++++++---
drivers/dma-buf/heaps/heap-helpers.c | 270 ---------------
drivers/dma-buf/heaps/heap-helpers.h | 53 ---
drivers/dma-buf/heaps/system_heap.c | 488 ++++++++++++++++++++++++---
include/linux/dma-heap.h | 9 +
7 files changed, 749 insertions(+), 432 deletions(-)
delete mode 100644 drivers/dma-buf/heaps/heap-helpers.c
delete mode 100644 drivers/dma-buf/heaps/heap-helpers.h

--
2.17.1