RE: [PATCH v18 4/5] dt-bindings: remoteproc: Add documentation for ZynqMP R5 rproc bindings
From: Ben Levinsky
Date: Thu Oct 08 2020 - 12:45:44 EST
Hi, Sorry, there were some typos and errors in my response, so I'll correct them below:
> -----Original Message-----
> From: Ben Levinsky <BLEVINSK@xxxxxxxxxx>
> Sent: Thursday, October 8, 2020 7:21 AM
> To: Linus Walleij <linus.walleij@xxxxxxxxxx>; Catalin Marinas
> <catalin.marinas@xxxxxxx>; Stefano Stabellini <stefanos@xxxxxxxxxx>; Ed T.
> Mooring <emooring@xxxxxxxxxx>
> Cc: Ed T. Mooring <emooring@xxxxxxxxxx>; sunnyliangjy@xxxxxxxxx; Punit
> Agrawal <punit1.agrawal@xxxxxxxxxxxxx>; Michal Simek
> <michals@xxxxxxxxxx>; michael.auchter@xxxxxx; open list:OPEN FIRMWARE
> AND FLATTENED DEVICE TREE BINDINGS <devicetree@xxxxxxxxxxxxxxx>;
> Mathieu Poirier <mathieu.poirier@xxxxxxxxxx>; linux-
> remoteproc@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; Rob Herring
> <robh+dt@xxxxxxxxxx>; Linux ARM <linux-arm-kernel@xxxxxxxxxxxxxxxxxxx>
> Subject: RE: [PATCH v18 4/5] dt-bindings: remoteproc: Add documentation for
> ZynqMP R5 rproc bindings
>
> Hi Linus,
>
> Thanks for the review
>
> Please see my comments inline
>
> >
> > Hi Ben,
> >
> > thanks for your patch! I noticed this today and took some interest
> > because in the past I worked on implementing the support for
> > TCM memory on ARM32.
> >
> > On Mon, Oct 5, 2020 at 6:06 PM Ben Levinsky <ben.levinsky@xxxxxxxxxx>
> > wrote:
> >
> > > Add binding for ZynqMP R5 OpenAMP.
> > >
> > > Represent the RPU domain resources in one device node. Each RPU
> > > processor is a subnode of the top RPU domain node.
> > >
> > > Signed-off-by: Jason Wu <j.wu@xxxxxxxxxx>
> > > Signed-off-by: Wendy Liang <jliang@xxxxxxxxxx>
> > > Signed-off-by: Michal Simek <michal.simek@xxxxxxxxxx>
> > > Signed-off-by: Ben Levinsky <ben.levinsky@xxxxxxxxxx>
> > (...)
> >
> > > +title: Xilinx R5 remote processor controller bindings
> > > +
> > > +description:
> > > + This document defines the binding for the remoteproc component that
> > loads and
> > > + boots firmwares on the Xilinx Zynqmp and Versal family chipset.
> >
> > ... firmwares for the on-board Cortex R5 of the Zynqmp .. (etc)
> >
> Will fix
> > > +
> > > + Note that the Linux has global addressing view of the R5-related
> memory
> > (TCM)
> > > + so the absolute address ranges are provided in TCM reg's.
> >
> > Please do not refer to Linux in bindings, they are also for other
> > operating systems.
> >
> > Isn't that spelled out "Tightly Coupled Memory" (please expand the
> acronym).
> >
> > I had a hard time parsing this description; do you mean:
> >
> > "The Tightly Coupled Memory (an on-chip SRAM) used by the Cortex R5
> > is double-ported and visible in both the physical memory space of the
> > Cortex R5 and the memory space of the main ZynqMP processor
> > cluster. This is visible in the address space of the ZynqMP processor
> > at the address indicated here."
> >
> > That would make sense, but please confirm/update.
> >
>
> Yes the TCM address space as noted in the TCM device tree nodes (e.g.
> 0xffe00000) is visible at that address on the A53 cluster. Will fix
>
> > > + memory-region:
> > > + description:
> > > + collection of memory carveouts used for elf-loading and inter-
> processor
> > > + communication. each carveout in this case should be in DDR, not
> > > + chip-specific memory. In Xilinx case, this is TCM, OCM, BRAM, etc.
> > > + $ref: /schemas/types.yaml#/definitions/phandle-array
> >
> > This is nice, you're reusing the infrastructure we already
> > have for these carveouts, good design!
> >
> > > + meta-memory-regions:
> > > + description:
> > > + collection of memories that are not present in the top level memory
> > > + nodes' mapping. For example, R5s' TCM banks. These banks are
> needed
> > > + for R5 firmware meta data such as the R5 firmware's heap and stack.
> > > + To be more precise, this is on-chip reserved SRAM regions, e.g. TCM,
> > > + BRAM, OCM, etc.
> > > + $ref: /schemas/types.yaml#/definitions/phandle-array
> >
> > Is this in the memory space of the main CPU cluster?
> >
> > It sure looks like that.
> >
>
> Yes this is in the memory space of the A53 cluster
>
> Will fix to comment as such. Thank you
>
> > > + /*
> > > + * Below nodes are required if using TCM to load R5 firmware
> > > + * if not, then either do not provide nodes are label as disabled in
> > > + * status property
> > > + */
> > > + tcm0a: tcm_0a@ffe00000 {
> > > + reg = <0xffe00000 0x10000>;
> > > + pnode-id = <0xf>;
> > > + no-map;
> > > + status = "okay";
> > > + phandle = <0x40>;
> > > + };
> > > + tcm0b: tcm_1a@ffe20000 {
> > > + reg = <0xffe20000 0x10000>;
> > > + pnode-id = <0x10>;
> > > + no-map;
> > > + status = "okay";
> > > + phandle = <0x41>;
> > > + };
> >
> > All right so this looks suspicious to me. Please explain
> > what we are seeing in those reg entries? Is this the address
> > seen by the main CPU cluster?
> >
> > Does it mean that the main CPU see the memory of the
> > R5 as "some kind of TCM" and that TCM is physically
> > mapped at 0xffe00000 (ITCM) and 0xffe20000 (DTCM)?
> >
> > If the first is ITCM and the second DTCM that is pretty
> > important to point out, since this reflects the Harvard
> > architecture properties of these two memory areas.
> >
> > The phandle = thing I do not understand at all, but
> > maybe there is generic documentation for it that
> > I've missed?
> >
> > Last time I checked (which was on the ARM32) the physical
> > address of the ITCM and DTCM could be changed at
> > runtime with CP15 instructions. I might be wrong
> > about this, but if that (or something similar) is still the case
> > you can't just say hardcode these addresses here, the
> > CPU can move that physical address somewhere else.
> > See the code in
> > arch/arm/kernel/tcm.c
> >
> > It appears the ARM64 Linux kernel does not have any
> > TCM handling today, but that could change.
> >
> > So is this just regular ARM TCM memory (as seen by
> > the main ARM64 cluster)?
> >
> > If this is the case, you should probably add back the
> > compatible string, add a separate device tree binding
> > for TCM memories along the lines of
> > compatible = "arm,itcm";
> > compatible = "arm,dtcm";
> > The reg address should then ideally be interpreted by
> > the ARM64 kernel and assigned to the I/DTCM.
> >
> > I'm paging Catalin on this because I do not know if
> > ARM64 really has [I|D]TCM or if this is some invention
> > of Xilinx's.
> >
> > Yours,
> > Linus Walleij
>
> As you said, this is just regular ARM TCM memory (as seen by the main
> ARM64 cluster). Yes, I can add back the compatible string, though maybe just
> "tcm" or "xlnx,tcm"; note there is also a TCM compatible string in the TI
> remoteproc driver binding listed below.
>
> So I think I answered this above, but the APU (A53 cluster) sees the TCM
> banks (referred to on Zynq UltraScale+ as banks 0A, 0B, 1A and 1B) at
> 0xffe00000 and 0xffe20000 for banks 0A and 0B respectively, and at
> 0xffe90000 and 0xffeb0000 for banks 1A and 1B. For reference, this is
> similar to the TI K3 R5 remoteproc driver binding:
> - https://patchwork.kernel.org/cover/11763783/
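>
> To illustrate, the reserved-memory nodes for banks 1A and 1B would look
> roughly like the 0A/0B nodes in the patch example (just a sketch; the
> pnode-id and phandle values below are placeholders and should be checked
> against the firmware node IDs):
>
> 	tcm1a: tcm_1a@ffe90000 {
> 		reg = <0xffe90000 0x10000>;
> 		pnode-id = <0x11>;	/* placeholder power-domain node ID */
> 		no-map;
> 		status = "okay";
> 		phandle = <0x42>;	/* placeholder phandle */
> 	};
> 	tcm1b: tcm_1b@ffeb0000 {
> 		reg = <0xffeb0000 0x10000>;
> 		pnode-id = <0x12>;	/* placeholder power-domain node ID */
> 		no-map;
> 		status = "okay";
> 		phandle = <0x43>;	/* placeholder phandle */
> 	};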
>
> The phandle array here serves as a reference to these nodes so that later on,
> in the remoteproc R5 driver, they can be looked up to get some information
> via the pnode-id property.
>
> The reason we have the compatible, reg, etc. is for the System DT effort +
> @Stefano Stabellini
>
> That being said, we can change this around to couple the TCM bank nodes
> into the R5 node, as we have in our present, internal implementation at
> - https://github.com/Xilinx/linux-
> xlnx/blob/master/Documentation/devicetree/bindings/remoteproc/xilinx%2C
> zynqmp-r5-remoteproc.txt
> - https://github.com/Xilinx/linux-
> xlnx/blob/master/drivers/remoteproc/zynqmp_r5_remoteproc.c
> There the TCM nodes are coupled into the R5 node, but after previous review on
> this list, they were decoupled from the R5 node.
>
> I am not sure what you mean regarding the Arm64 handling of TCM memory. From
> the architecture of the SoC as described in
> https://www.xilinx.com/support/documentation/user_guides/ug1085-zynq-
> ultrascale-trm.pdf I know that the A53 cluster can see the absolute addresses
> of the R5 cluster's TCM, so the translation is *I think* done as you describe
> with the CP15 instructions you listed.
>
Sorry, the above is unclear! Editing it for clarity:
The phandle in each TCM node is referenced in the "meta-memory-regions" property. It is needed because TCM, BRAM, OCM, or other SoC-specific on-chip memories may be used for firmware loading in addition to the generic DDR carveouts.
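For example, an R5 node might reference the TCM reserved-memory nodes roughly like this (an illustrative sketch only, not the exact schema; labels other than tcm0a/tcm0b are placeholders):

	r5_0: r5@0 {
		...
		/* DDR carveouts for ELF loading and vrings (placeholder labels) */
		memory-region = <&rproc_0_reserved>, <&rpu0vdev0vring0>;
		/* on-chip memories, such as the TCM bank nodes in the example */
		meta-memory-regions = <&tcm0a>, <&tcm0b>;
		...
	};
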
The TCM nodes originally had a compatible string, but it was removed after the device tree maintainer Rob H. commented that TCM is not specific to Xilinx. With that in mind, I am not sure whether "arm,itcm" or "arm,dtcm" would be appropriate.
The TCM banks are addressable at the values provided in their reg properties via a physical bus, not via an instruction, AFAIK, so the CP15 instructions are not used.
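Roughly, per UG1085 (worth double-checking against the TRM), the view is:

	TCM bank         R5-local address   Global (A53-visible) address
	0A (R5_0 ATCM)   0x00000000         0xffe00000
	0B (R5_0 BTCM)   0x00020000         0xffe20000
	1A (R5_1 ATCM)   0x00000000         0xffe90000
	1B (R5_1 BTCM)   0x00020000         0xffeb0000
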
Apologies again for the previously ambiguous response; I hope this addresses your concerns.
Ben