Re: [PATCH] arm64: dts: ti: k3-am62x-sk-common: Reserve 128MiB of global CMA

From: Andrew Davis
Date: Mon Aug 07 2023 - 14:01:03 EST


On 8/7/23 1:03 AM, Devarsh Thakkar wrote:
Hi Nishanth,

Thanks for the review.

On 06/08/23 01:03, Nishanth Menon wrote:
On 16:44-20230803, Devarsh Thakkar wrote:
Reserve 128MiB of global CMA which is also marked as re-usable
so that OS can also use the same if peripheral drivers are not using the
same.

AM62x supports multimedia components such as GPU, dual Display and Camera.
Assuming the worst-case scenario where all 3 are run in parallel below
is the calculation :

1) OV5640 camera sensor supports 1920x1080 resolution
-> 1920 width x 1080 height x 2 bytesperpixel x 8 buffers
(default in yavta) : 32MiB

2) 1920x1200 Microtips LVDS panel supported
-> 1920 width x 1080 height x 4 bytesperpixel x 2 buffers :
16 MiB

3) 1920x1080 HDMI display supported
-> 1920 width x 1080 height x 4 bytesperpixel x 2 buffers :
15.82 MiB which is ~16 MiB

4) IMG GPU shares with display allocated buffers while rendering
but in case some dedicated operation viz color conversion,
keeping same window of ~16 MiB for GPU too.

Total is 80 MiB and adding 32 MiB for other peripherals and extra
16 MiB to keep as buffer for fragmentation thus rounding total to 128
MiB.

Signed-off-by: Devarsh Thakkar <devarsht@xxxxxx>
Acked-by: Darren Etheridge <detheridge@xxxxxx>
Signed-off-by: Vignesh Raghavendra <vigneshr@xxxxxx>
---

I don't think this is right approach. There are other techniques
than having to do this (Andrew: please comment) and require drivers to
behave properly.

Sorry but I did not understand clearly the disadvantage of this approach.
Here we are reserving CMA and also marking it as re-usable so that in case
driver is not using it OS can use that region.


It isn't always that easy, many types of allocations can be pinned and
cannot be placed in this region. It still has cost.

Also I see quite a few vendors already taking this approach :


Just because others have gotten away with it doesn't mean it is correct :)

There are some cases when the DMA/CMA region needs to be in a specific
location as the hardware only supports some addresses (only some address
pins wired out, etc..). But general CMA size selection is a configuration
and so has no place in DT which should only be used to describe hardware.

Another issue I have is that this forces all users of these boards to
have this rather large carveout, even if they do not intend to use all
of these IP at the same time, or even at all.

Actually, upstream we don't support GPU yet, so you can't use all of
this carveout anyway.

Lastly, large CMA carveouts as in this case are masking a bigger issue,
there is hardware IP that cannot handle scatter-gather and there is
no system level IOMMU to help with that. This simply does not scale,
fragmentation can set in even with CMA in a running system, physically
contiguous allocations can still fail. As our devices grow in complexity
while still not having an IOMMU we would need to reserve ever increasingly
sized CMA areas.

This might sound like an ad absurdum argument, but we only need to look
at our current evil vendor tree to see where this leads [0]. Yes you
are reading the size right, 1.75GB(!) of CMA..

We need a better solution upstream, I'm not claiming I know what
that solution is (probably something involved in the memory allocation
to allow for more/larger movable pages). But for the above 3 reasons
this patch is not a viable solution.

[0] https://git.ti.com/cgit/ti-linux-kernel/ti-linux-kernel/tree/arch/arm64/boot/dts/ti/k3-am69-sk.dts?h=ti-linux-6.1.y#n48

Andrew

$grep -r cma-default arch/arm64/boot/dts
arch/arm64/boot/dts/qcom/sc8280xp-lenovo-thinkpad-x13s.dts:276:
linux,cma-default;
arch/arm64/boot/dts/qcom/sc8280xp-crd.dts:222:
linux,cma-default;
arch/arm64/boot/dts/freescale/imx8mp-tqma8mpql-mba8mpxl.dts:201:
linux,cma-default;
arch/arm64/boot/dts/freescale/imx8ulp-evk.dts:32:
linux,cma-default;
arch/arm64/boot/dts/freescale/imx8-apalis-v1.1.dtsi:198:
linux,cma-default;
arch/arm64/boot/dts/freescale/imx8mm-tqma8mqml.dtsi:48:
linux,cma-default;
arch/arm64/boot/dts/freescale/imx8dxl-evk.dts:50:
linux,cma-default;
arch/arm64/boot/dts/freescale/imx93-tqma9352.dtsi:24:
linux,cma-default;
arch/arm64/boot/dts/freescale/imx8mq-tqma8mq.dtsi:59:
linux,cma-default;
arch/arm64/boot/dts/freescale/imx8mn-tqma8mqnl.dtsi:46:
linux,cma-default;
arch/arm64/boot/dts/freescale/imx93-11x11-evk.dts:28:
linux,cma-default;
arch/arm64/boot/dts/hisilicon/hi6220-hikey.dts:67:
linux,cma-default;
arch/arm64/boot/dts/amlogic/meson-gx.dtsi:63:
linux,cma-default;
arch/arm64/boot/dts/amlogic/meson-g12-common.dtsi:121:
linux,cma-default;
arch/arm64/boot/dts/amlogic/meson-a1.dtsi:59:
linux,cma-default;


I am esp concerned since there are platforms based on
am62x and just 256MB DDR.


The file "k3-am62x-sk-common.dtsi" refers DDR memory size as 2Gb[1] and so I
put CMA reservation in same file assuming all boards including this file have
2Gb.

But if there are some boards having lesser DDR and including this
k3-am62x-sk-common.dtsi and overriding memory node, I can put the CMA
reservation node in board specific file i.e. k3-am625-sk.dts in V2.

Kindly let me know if above is preferred approach.

[1]
https://gitlab.com/linux-kernel/linux-next/-/blob/next-20230807/arch/arm64/boot/dts/ti/k3-am62x-sk-common.dtsi?ref_type=tags#L33

Regards
Devarsh

arch/arm64/boot/dts/ti/k3-am62x-sk-common.dtsi | 8 ++++++++
1 file changed, 8 insertions(+)

diff --git a/arch/arm64/boot/dts/ti/k3-am62x-sk-common.dtsi b/arch/arm64/boot/dts/ti/k3-am62x-sk-common.dtsi
index 34c8ffc553ec..9dd6e23ca9ca 100644
--- a/arch/arm64/boot/dts/ti/k3-am62x-sk-common.dtsi
+++ b/arch/arm64/boot/dts/ti/k3-am62x-sk-common.dtsi
@@ -47,6 +47,14 @@ ramoops@9ca00000 {
pmsg-size = <0x8000>;
};
+ /* global cma region */
+ linux,cma {
+ compatible = "shared-dma-pool";
+ reusable;
+ size = <0x00 0x8000000>;
+ linux,cma-default;
+ };
+
secure_tfa_ddr: tfa@9e780000 {
reg = <0x00 0x9e780000 0x00 0x80000>;
alignment = <0x1000>;
--
2.34.1