Re: [PATCHv5 06/14] remoteproc/omap: Initialize and assign reserved memory node

From: Tero Kristo
Date: Wed Feb 05 2020 - 03:51:46 EST


On 31/01/2020 01:04, Andrew F. Davis wrote:
On 1/30/20 5:06 PM, Suman Anna wrote:
On 1/30/20 3:57 PM, Suman Anna wrote:
On 1/30/20 3:50 PM, Andrew F. Davis wrote:
On 1/30/20 4:39 PM, Suman Anna wrote:
On 1/30/20 3:19 PM, Andrew F. Davis wrote:
On 1/30/20 3:39 PM, Suman Anna wrote:
On 1/30/20 2:22 PM, Andrew F. Davis wrote:
On 1/30/20 2:55 PM, Suman Anna wrote:
On 1/30/20 1:42 PM, Tero Kristo wrote:
On 30/01/2020 21:20, Andrew F. Davis wrote:
On 1/30/20 2:18 PM, Tero Kristo wrote:
On 30/01/2020 20:11, Andrew F. Davis wrote:
On 1/16/20 8:53 AM, Tero Kristo wrote:
From: Suman Anna <s-anna@xxxxxx>

The reserved memory nodes are not assigned to platform devices by
default in the driver core to avoid the lookup for every platform
device and incur a penalty as the real users are expected to be
only a few devices.

OMAP remoteproc devices fall into the above category and the OMAP
remoteproc driver _requires_ specific CMA pools to be assigned
for each device at the moment to align on the location of the
vrings and vring buffers in the RTOS-side firmware images. So,


Same comment as before, this is a firmware issue for only some
firmwares
that do not handle being assigned vring locations correctly and instead
hard-code them.

As for this statement, this can do with some updating. Post 4.20,
because of the lazy allocation scheme used for carveouts including the
vrings, the resource tables now have to use FW_RSC_ADDR_ANY and will
have to wait for the vdev synchronization to happen.


I believe we discussed this topic in length in previous version but
there was no conclusion on it.

The commit desc might be a bit misleading, we are not actually forced to
use specific CMA buffers, as we use IOMMU to map these to device
addresses. For example IPU1/IPU2 use internally exact same memory
addresses, iommu is used to map these to specific CMA buffer.

CMA buffers are mostly used so that we get aligned large chunk of memory
which can be mapped properly with the limited IOMMU OMAP family of chips
have. Not sure if there is any sane way to get this done in any other
manner.



Why not use the default CMA area?

I think using default CMA area getting the actual memory block is not
guaranteed and might fail. There are other users for the memory, and it
might get fragmented at the very late phase we are grabbing the memory
(omap remoteproc driver probe time.) Some chunks we need are pretty large.

I believe I could experiment with this a bit though and see, or Suman
could maybe provide feedback why this was designed initially like this
and why this would not be a good idea.

I have given some explanation on this on v4 as well, but if it is not
clear, there are restrictions with using default CMA. Default CMA has
switched to be assigned from the top of the memory (higher addresses,
since 3.18 IIRC), and the MMUs on IPUs and DSPs can only address
32-bits. So, we cannot blindly use the default CMA pool, and this will
definitely not work on boards > 2 GB RAM. And, if you want to add in any
firewall capability, then specific physical addresses becomes mandatory.



If you need 32bit range allocations then
dma_set_mask(dev, DMA_BIT_MASK(32));

I'm not saying don't have support for carveouts, just make them
optional, keystone_remoteproc.c does this:

if (of_reserved_mem_device_init(dev))
dev_warn(dev, "device does not have specific CMA pool\n");

There doesn't even needs to be a warning but that is up to you.

It is not exactly an apples to apples comparison. K2s do not have MMUs,
and most of our firmware images on K2 are actually running out of the
DSP internal memory.



So again we circle back to it being a firmware issue, if K2 can get away
without needing carveouts and it doesn't even have an MMU then certainly
OMAP/DRA7x class devices can handle it even better given they *do* have
an IOMMU. Unless someone is hard-coding the IOMMU configuration.. In
which case we are still just hacking around the problem here with
mandatory specific address memory carveouts.

Optional carveouts on OMAP remoteprocs can be an enhancement in the
future, but at the moment, we won't be able to run use-cases without
this. And I have already given some of the reasons for the same here and
on v4.



No reason to be dismissive, my questions are valid.

What "use-cases" are we talking about, I have firmware that doesn't need
specific carved-out addresses.

I think you are well aware of all the usecases we provide with the TI
SDKs with IPUs and DSPs. And what is the firmware that you have and what
do you use it for?



Yes I know exactly the pieces of TI firmware we are talking about and
why it they still need carveouts. That's not the point, our firmware may
have issues and hard-coding, but we need to allow for correctly built
firmware that doesn't need carveouts also. This driver should not fail
if a carveout is not provided. The remoteproc can run fine without a
carveout, only some firmwares cannot, so it should be optional and not
forced in DT on everyone using our DSP/IPU.


If you have misbehaving firmware that
needs statically carved out memory addresses then you can have carveouts
if you want, but it should be optional.
If I don't want to pollute my
system's memory space with a bunch of carveout holes then I shouldn't
have to just because your specific firmware needs them.

Further follow-up series like early-boot and late-attach will mandate
fixed carveouts actually. You cannot just run out of any random memory.



Those are different, the location of the loaded firmware in memory will
need to be carved out if it is in use by a remote core before Linux
boots. This carveout is for Linux to allocate from to load the Vrings
and other memory it may need. When late-attach shows up then we can
think about how to handle those.


Also, these are CMA pools ("reusable"), so they are not actual carveout
holes ("no-map"). This is the preferred method in remoteproc mode so
that the memory is available for kernel when remoteprocs are not in use.
Customers can always choose to make these carveouts so that they do not
run into memory allocation issues when changing firmwares and under
stress conditions. These will have to be carveouts for early-boot usecases.



Even "reusable" carveouts can only be used by re-locateable memory
(caches and such) so still not a good thing to have your memory space
full of them.

Customers can always choose to make these carveouts

That is exactly what I am saying, they can choose, but it should be
optional, the current binding and driver make them mandatory or the
driver will not probe.

Ok, so just to re-cap here, we (me + Suman + Andrew) aligned offline on this and the plan from our side is to:

- Make the memory-regions part of the binding optional, not mandatory
- Make the omap remoteproc driver portion for the memory-regions also optional, not fail to probe
- Print a warning from the omap driver if memory regions are not in place, claiming user knows what he is doing, as right now nothing works without the memory regions defined.

I'll post v6 against -rc1 with these changes once it is available.

-Tero



Andrew


regards
Suman


regards
Suman


Andrew


regards
Suman


Andrew


regards
Suman


Andrew


regards
Suman


-Tero


Andrew


-Tero


This is not a requirement of the remote processor itself and so it
should not fail to probe if a specific memory carveout isn't given.


use the of_reserved_mem_device_init/release() API appropriately
to assign the corresponding reserved memory region to the OMAP
remoteproc device. Note that only one region per device is
allowed by the framework.

Signed-off-by: Suman Anna <s-anna@xxxxxx>
Signed-off-by: Tero Kristo <t-kristo@xxxxxx>
Reviewed-by: Bjorn Andersson <bjorn.andersson@xxxxxxxxxx>
---
v5: no changes

ÂÂ drivers/remoteproc/omap_remoteproc.c | 12 +++++++++++-
ÂÂ 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/drivers/remoteproc/omap_remoteproc.c
b/drivers/remoteproc/omap_remoteproc.c
index 0846839b2c97..194303b860b2 100644
--- a/drivers/remoteproc/omap_remoteproc.c
+++ b/drivers/remoteproc/omap_remoteproc.c
@@ -17,6 +17,7 @@
ÂÂ #include <linux/module.h>
ÂÂ #include <linux/err.h>
ÂÂ #include <linux/of_device.h>
+#include <linux/of_reserved_mem.h>
ÂÂ #include <linux/platform_device.h>
ÂÂ #include <linux/dma-mapping.h>
ÂÂ #include <linux/remoteproc.h>
@@ -480,14 +481,22 @@ static int omap_rproc_probe(struct
platform_device *pdev)
ÂÂÂÂÂÂ if (ret)
ÂÂÂÂÂÂÂÂÂÂ goto free_rproc;
ÂÂ +ÂÂÂ ret = of_reserved_mem_device_init(&pdev->dev);
+ÂÂÂ if (ret) {
+ÂÂÂÂÂÂÂ dev_err(&pdev->dev, "device does not have specific CMA
pool\n");
+ÂÂÂÂÂÂÂ goto free_rproc;
+ÂÂÂ }
+
ÂÂÂÂÂÂ platform_set_drvdata(pdev, rproc);
ÂÂ ÂÂÂÂÂ ret = rproc_add(rproc);
ÂÂÂÂÂÂ if (ret)
-ÂÂÂÂÂÂÂ goto free_rproc;
+ÂÂÂÂÂÂÂ goto release_mem;
ÂÂ ÂÂÂÂÂ return 0;
ÂÂ +release_mem:
+ÂÂÂ of_reserved_mem_device_release(&pdev->dev);
ÂÂ free_rproc:
ÂÂÂÂÂÂ rproc_free(rproc);
ÂÂÂÂÂÂ return ret;
@@ -499,6 +508,7 @@ static int omap_rproc_remove(struct
platform_device *pdev)
ÂÂ ÂÂÂÂÂ rproc_del(rproc);
ÂÂÂÂÂÂ rproc_free(rproc);
+ÂÂÂ of_reserved_mem_device_release(&pdev->dev);
ÂÂ ÂÂÂÂÂ return 0;
ÂÂ }


--

--






--
Texas Instruments Finland Oy, Porkkalankatu 22, 00180 Helsinki. Y-tunnus/Business ID: 0615521-4. Kotipaikka/Domicile: Helsinki