[PATCH v2] dma-buf: system_heap: avoid reclaim for order 4
From: Jaewon Kim
Date: Thu Feb 09 2023 - 23:57:06 EST
Using order 4 pages would be helpful for IOMMUs mapping, but trying to
get order 4 pages could spend quite much time in the page allocation.
>From the perspective of responsiveness, the deterministic memory
allocation speed, I think, is quite important.
The order 4 allocation with __GFP_RECLAIM may spend much time in
reclaim and compation logic. __GFP_NORETRY also may affect. These cause
unpredictable delay.
To get reasonable allocation speed from dma-buf system heap, use
HIGH_ORDER_GFP for order 4 to avoid reclaim. And let me remove
meaningless __GFP_COMP for order 0.
According to my tests, order 4 with MID_ORDER_GFP could get more number
of order 4 pages but the elapsed times could be very slow.
time order 8 order 4 order 0
584 usec 0 160 0
28,428 usec 0 160 0
100,701 usec 0 160 0
76,645 usec 0 160 0
25,522 usec 0 160 0
38,798 usec 0 160 0
89,012 usec 0 160 0
23,015 usec 0 160 0
73,360 usec 0 160 0
76,953 usec 0 160 0
31,492 usec 0 160 0
75,889 usec 0 160 0
84,551 usec 0 160 0
84,352 usec 0 160 0
57,103 usec 0 160 0
93,452 usec 0 160 0
If HIGH_ORDER_GFP is used for order 4, the number of order 4 could be
decreased but the elapsed time results were quite stable and fast
enough.
time order 8 order 4 order 0
1,356 usec 0 155 80
1,901 usec 0 11 2384
1,912 usec 0 0 2560
1,911 usec 0 0 2560
1,884 usec 0 0 2560
1,577 usec 0 0 2560
1,366 usec 0 0 2560
1,711 usec 0 0 2560
1,635 usec 0 28 2112
544 usec 10 0 0
633 usec 2 128 0
848 usec 0 160 0
729 usec 0 160 0
1,000 usec 0 160 0
1,358 usec 0 160 0
2,638 usec 0 31 2064
Signed-off-by: Jaewon Kim <jaewon31.kim@xxxxxxxxxxx>
---
drivers/dma-buf/heaps/system_heap.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/drivers/dma-buf/heaps/system_heap.c b/drivers/dma-buf/heaps/system_heap.c
index e8bd10e60998..920db302a273 100644
--- a/drivers/dma-buf/heaps/system_heap.c
+++ b/drivers/dma-buf/heaps/system_heap.c
@@ -41,12 +41,11 @@ struct dma_heap_attachment {
bool mapped;
};
-#define LOW_ORDER_GFP (GFP_HIGHUSER | __GFP_ZERO | __GFP_COMP)
-#define MID_ORDER_GFP (LOW_ORDER_GFP | __GFP_NOWARN)
+#define LOW_ORDER_GFP (GFP_HIGHUSER | __GFP_ZERO)
#define HIGH_ORDER_GFP (((GFP_HIGHUSER | __GFP_ZERO | __GFP_NOWARN \
| __GFP_NORETRY) & ~__GFP_RECLAIM) \
| __GFP_COMP)
-static gfp_t order_flags[] = {HIGH_ORDER_GFP, MID_ORDER_GFP, LOW_ORDER_GFP};
+static gfp_t order_flags[] = {HIGH_ORDER_GFP, HIGH_ORDER_GFP, LOW_ORDER_GFP};
/*
* The selection of the orders used for allocation (1MB, 64K, 4K) is designed
* to match with the sizes often found in IOMMUs. Using order 4 pages instead
--
2.17.1