RE: [PATCH v18 4/5] dt-bindings: remoteproc: Add documentation for ZynqMP R5 rproc bindings

From: Ben Levinsky
Date: Thu Oct 08 2020 - 10:21:31 EST


Hi Linus,

Thanks for the review

Please see my comments inline

>
> Hi Ben,
>
> thanks for your patch! I noticed this today and took some interest
> because in the past I was involved in implementing the support for
> TCM memory on ARM32.
>
> On Mon, Oct 5, 2020 at 6:06 PM Ben Levinsky <ben.levinsky@xxxxxxxxxx>
> wrote:
>
> > Add binding for ZynqMP R5 OpenAMP.
> >
> > Represent the RPU domain resources in one device node. Each RPU
> > processor is a subnode of the top RPU domain node.
> >
> > Signed-off-by: Jason Wu <j.wu@xxxxxxxxxx>
> > Signed-off-by: Wendy Liang <jliang@xxxxxxxxxx>
> > Signed-off-by: Michal Simek <michal.simek@xxxxxxxxxx>
> > Signed-off-by: Ben Levinsky <ben.levinsky@xxxxxxxxxx>
> (...)
>
> > +title: Xilinx R5 remote processor controller bindings
> > +
> > +description:
> > + This document defines the binding for the remoteproc component that
> > + loads and boots firmwares on the Xilinx Zynqmp and Versal family
> > + chipset.
>
> ... firmwares for the on-board Cortex R5 of the Zynqmp .. (etc)
>
Will fix
> > +
> > + Note that the Linux has global addressing view of the R5-related
> > + memory (TCM) so the absolute address ranges are provided in TCM reg's.
>
> Please do not refer to Linux in bindings, they are also for other
> operating systems.
>
> Isn't that spelled out "Tightly Coupled Memory" (please expand the acronym).
>
> I had a hard time parsing this description; do you mean:
>
> "The Tightly Coupled Memory (an on-chip SRAM) used by the Cortex R5
> is double-ported and visible in both the physical memory space of the
> Cortex A5 and the memory space of the main ZynqMP processor
> cluster. This is visible in the address space of the ZynqMP processor
> at the address indicated here."
>
> That would make sense, but please confirm/update.
>

Yes, the TCM address space as given in the TCM device tree nodes (e.g. 0xffe00000) is visible at that same address from the A53 cluster. Will fix the description accordingly.
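
To illustrate, a minimal sketch based on the example nodes in this patch (the R5-view address is taken from the TRM; the comment wording is mine, not from the binding):

    /* TCM bank 0A: double-ported, appears at 0x0 in the R5's own
     * address map and at 0xffe00000 in the A53/global address map */
    tcm0a: tcm_0a@ffe00000 {
        reg = <0xffe00000 0x10000>;
        no-map;
    };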

> > + memory-region:
> > + description:
> > + collection of memory carveouts used for elf-loading and inter-processor
> > + communication. each carveout in this case should be in DDR, not
> > + chip-specific memory. In Xilinx case, this is TCM, OCM, BRAM, etc.
> > + $ref: /schemas/types.yaml#/definitions/phandle-array
>
> This is nice, you're reusing the infrastructure we already
> have for these carveouts, good design!
>
> > + meta-memory-regions:
> > + description:
> > + collection of memories that are not present in the top level memory
> > + nodes' mapping. For example, R5s' TCM banks. These banks are needed
> > + for R5 firmware meta data such as the R5 firmware's heap and stack.
> > + To be more precise, this is on-chip reserved SRAM regions, e.g. TCM,
> > + BRAM, OCM, etc.
> > + $ref: /schemas/types.yaml#/definitions/phandle-array
>
> Is this in the memory space of the main CPU cluster?
>
> It sure looks like that.
>

Yes, this is in the memory space of the A53 cluster.

Will update the description to note this. Thank you.
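
To make the distinction between the two properties concrete, a rough sketch of a consumer node (label names here are illustrative only, not from the patch):

    r5_0: r5@0 {
        /* DDR carveouts for ELF loading and inter-processor communication */
        memory-region = <&rproc_reserved &rpu0vring0>;
        /* on-chip SRAM (TCM) banks, addressed as the A53 cluster sees them */
        meta-memory-regions = <&tcm0a &tcm0b>;
    };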

> > + /*
> > + * Below nodes are required if using TCM to load R5 firmware;
> > + * if not, then either do not provide the nodes or label them as
> > + * disabled in the status property
> > + */
> > + tcm0a: tcm_0a@ffe00000 {
> > + reg = <0xffe00000 0x10000>;
> > + pnode-id = <0xf>;
> > + no-map;
> > + status = "okay";
> > + phandle = <0x40>;
> > + };
> > + tcm0b: tcm_1a@ffe20000 {
> > + reg = <0xffe20000 0x10000>;
> > + pnode-id = <0x10>;
> > + no-map;
> > + status = "okay";
> > + phandle = <0x41>;
> > + };
>
> All right so this looks suspicious to me. Please explain
> what we are seeing in those reg entries? Is this the address
> seen by the main CPU cluster?
>
> Does it mean that the main CPU see the memory of the
> R5 as "some kind of TCM" and that TCM is physically
> mapped at 0xffe00000 (ITCM) and 0xffe20000 (DTCM)?
>
> If the first is ITCM and the second DTCM, that is pretty
> important to point out, since this reflects the Harvard
> architecture properties of these two memory areas.
>
> The phandle = thing I do not understand at all, but
> maybe there is generic documentation for it that
> I've missed?
>
> Last time I checked (which was on the ARM32) the physical
> address of the ITCM and DTCM could be changed at
> runtime with CP15 instructions. I might be wrong
> about this, but if that (or something similar) is still the case
> you can't just hardcode these addresses here, since the
> CPU can move that physical address somewhere else.
> See the code in
> arch/arm/kernel/tcm.c
>
> It appears the ARM64 Linux kernel does not have any
> TCM handling today, but that could change.
>
> So is this just regular ARM TCM memory (as seen by
> the main ARM64 cluster)?
>
> If this is the case, you should probably add back the
> compatible string, add a separate device tree binding
> for TCM memories along the lines of
> compatible = "arm,itcm";
> compatible = "arm,dtcm";
> The reg address should then ideally be interpreted by
> the ARM64 kernel and assigned to the I/DTCM.
>
> I'm paging Catalin on this because I do not know if
> ARM64 really has [I|D]TCM or if this is some invention
> of Xilinx's.
>
> Yours,
> Linus Walleij

As you said, this is just regular ARM TCM memory (as seen by the main ARM64 cluster). Yes, I can add back the compatible string, though perhaps just "tcm" or "xlnx,tcm"; note that there is also a TCM compatible string in the TI remoteproc binding listed below.

So I think I answered this above, but the APU (A53 cluster) sees TCM banks 0A and 0B (as they are referred to on Zynq UltraScale+) at 0xffe00000 and 0xffe20000 respectively, and TCM banks 1A and 1B at 0xffe90000 and 0xffeb0000; see the sketch after the link below. This is also similar to the TI K3 R5 remoteproc driver binding, for reference:
- https://patchwork.kernel.org/cover/11763783/
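
Here is a sketch of all four bank nodes with the compatible string added back, using the addresses above ("xlnx,tcm" is only the proposal from this mail, and the 64 KB bank size follows the example in the patch):

    tcm0a: tcm_0a@ffe00000 {
        compatible = "xlnx,tcm";
        reg = <0xffe00000 0x10000>;
        no-map;
    };
    tcm0b: tcm_0b@ffe20000 {
        compatible = "xlnx,tcm";
        reg = <0xffe20000 0x10000>;
        no-map;
    };
    tcm1a: tcm_1a@ffe90000 {
        compatible = "xlnx,tcm";
        reg = <0xffe90000 0x10000>;
        no-map;
    };
    tcm1b: tcm_1b@ffeb0000 {
        compatible = "xlnx,tcm";
        reg = <0xffeb0000 0x10000>;
        no-map;
    };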

The phandle array here serves as a link to these nodes so that later on, in the remoteproc R5 driver, they can be referenced to obtain information via the pnode-id property.
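
Roughly, the lookup chain looks like this (a sketch only; as I understand it, pnode-id is the platform-management node ID the driver passes to the platform firmware when requesting the bank):

    tcm0a: tcm_0a@ffe00000 {
        reg = <0xffe00000 0x10000>;
        pnode-id = <0xf>;    /* consumed by the remoteproc driver */
        phandle = <0x40>;
    };

    r5_0: r5@0 {
        /* the driver follows this phandle to tcm0a, then uses
         * its pnode-id to identify the bank to the firmware */
        meta-memory-regions = <&tcm0a>;
    };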

The reason we have the compatible, reg, etc. is for the System DT effort (+ @Stefano Stabellini).

That being said, we can change this around to couple the TCM bank nodes into the R5 node, as we have in our present internal implementation at
- https://github.com/Xilinx/linux-xlnx/blob/master/Documentation/devicetree/bindings/remoteproc/xilinx%2Czynqmp-r5-remoteproc.txt
- https://github.com/Xilinx/linux-xlnx/blob/master/drivers/remoteproc/zynqmp_r5_remoteproc.c
There the TCM nodes are coupled into the R5 node, but after some previous review on this list, the binding was changed to decouple the TCM nodes from the R5 node. A sketch of the coupled form follows.
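
For comparison, a rough sketch of the coupled layout (shape only, paraphrased from the linked downstream binding; see those links for the exact property names):

    r5_0: r5@0 {
        tcm_0a: tcm@ffe00000 {
            reg = <0xffe00000 0x10000>;
        };
        tcm_0b: tcm@ffe20000 {
            reg = <0xffe20000 0x10000>;
        };
    };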

I am not sure what you mean regarding the Arm64 handling of TCM memory. From the architecture of the SoC (https://www.xilinx.com/support/documentation/user_guides/ug1085-zynq-ultrascale-trm.pdf) I know that the A53 cluster can see the absolute addresses of the R5 cluster's TCM, so the translation is, *I think*, done as you describe with the CP15 instructions you listed.