RE: [PATCH Xilinx Alveo 1/8] Documentation: fpga: Add a document describing Alveo XRT drivers

From: Max Zhen
Date: Wed Dec 02 2020 - 22:39:37 EST


Hi Moritz,

Please see my reply below.

Thanks,
-Max

> -----Original Message-----
> From: Moritz Fischer <mdf@xxxxxxxxxx>
> Sent: Wednesday, December 2, 2020 15:10
> To: Max Zhen <maxz@xxxxxxxxxx>
> Cc: Moritz Fischer <mdf@xxxxxxxxxx>; Sonal Santan <sonals@xxxxxxxxxx>;
> linux-kernel@xxxxxxxxxxxxxxx; linux-fpga@xxxxxxxxxxxxxxx; Lizhi Hou
> <lizhih@xxxxxxxxxx>; Michal Simek <michals@xxxxxxxxxx>; Stefano
> Stabellini <stefanos@xxxxxxxxxx>; devicetree@xxxxxxxxxxxxxxx
> Subject: Re: [PATCH Xilinx Alveo 1/8] Documentation: fpga: Add a
> document describing Alveo XRT drivers
>
>
> Hi Max,
>
> On Wed, Dec 02, 2020 at 09:24:29PM +0000, Max Zhen wrote:
> > Hi Moritz,
> >
> > Thanks for your feedback. Please see my reply inline.
> >
> > Thanks,
> > -Max
> >
> > > -----Original Message-----
> > > From: Moritz Fischer <mdf@xxxxxxxxxx>
> > > Sent: Monday, November 30, 2020 20:55
> > > To: Sonal Santan <sonals@xxxxxxxxxx>
> > > Cc: linux-kernel@xxxxxxxxxxxxxxx; linux-fpga@xxxxxxxxxxxxxxx; Max
> > > Zhen <maxz@xxxxxxxxxx>; Lizhi Hou <lizhih@xxxxxxxxxx>; Michal
> > > Simek <michals@xxxxxxxxxx>; Stefano Stabellini
> > > <stefanos@xxxxxxxxxx>; devicetree@xxxxxxxxxxxxxxx
> > > Subject: Re: [PATCH Xilinx Alveo 1/8] Documentation: fpga: Add a
> > > document describing Alveo XRT drivers
> > >
> > >
> > > On Sat, Nov 28, 2020 at 04:00:33PM -0800, Sonal Santan wrote:
> > > > From: Sonal Santan <sonal.santan@xxxxxxxxxx>
> > > >
> > > > Describe Alveo XRT driver architecture and provide basic
> > > > overview of Xilinx Alveo platform.
> > > >
> > > > Signed-off-by: Sonal Santan <sonal.santan@xxxxxxxxxx>
> > > > ---
> > > > Documentation/fpga/index.rst | 1 +
> > > > Documentation/fpga/xrt.rst | 588
> > > +++++++++++++++++++++++++++++++++++
> > > > 2 files changed, 589 insertions(+) create mode 100644
> > > > Documentation/fpga/xrt.rst
> > > >

[...cut...]

> > > > +xclbin over the User partition as part of DFX. When a user
> > > > +requests loading of a specific xclbin the xmgmt management
> > > > +driver reads the parent interface UUID specified in the xclbin
> > > > +and matches it with child interface UUID exported by Shell to
> > > > +determine if xclbin is compatible
> > > with the Shell. If match fails loading of xclbin is denied.
> > > > +
> > > > +xclbin loading is requested using ICAP_DOWNLOAD_AXLF ioctl
> command.
> > > > +When loading xclbin xmgmt driver performs the following operations:
> > > > +
> > > > +1. Sanity check the xclbin contents 2. Isolate the User
> > > > +partition 3. Download the bitstream using the FPGA config engine (ICAP) 4.
> > > > +De-isolate the User partition
> > > Is this modelled as bridges and regions?
> >
> > Alveo drivers as written today do not use fpga bridge and region
> framework. It seems that if we add support for that framework, it’s
> possible to receive PR program request from kernel outside of xmgmt driver?
> Currently, we can’t support this and PR program can only be initiated
> using XRT’s runtime API in user space.
>
> I'm not 100% sure I understand the concern here, let me reply to what
> I think I understand:
>
> You're worried that if you use FPGA region as interface to accept PR
> requests something else could attempt to reconfigure the region from
> within the kernel using the FPGA Region API?
>
> Assuming I got this right, I don't think this is a big deal. When you
> create the regions you control who gets the references to it.

Thanks for explaining. Yes, I think you got my point :-).

>
> From what I've seen so far Regions seem to be roughly equivalent to
> Partitions, hence my surprise to see a new structure bypassing them.

I see where the gap is.

Regions in Linux is very different than "partitions" we have defined in xmgmt. Regions seem to be a software data structure representing an area on the FPGA that can be reprogrammed. This area is protected by the concept of "bridge" which can be disabled before program and reenabled after it. And you go through region when you need to reprogram this area.

The "partition" is part of the main infrastructure of xmgmt driver, which represents a group of subdev drivers for each individual IP (HW subcomponents). Basically, xmgmt root driver is parent of several partitions who is, in turn, the parent of several subdev drivers. The parent manages the life cycle of its children here.

We do have a partition to represent the group of subdevs/IPs in the reprogrammable area. And we also have partitions representing other areas which cannot be reprogrammed. So, it is difficult to use "Region" to implement "partition".

From what you have explained, it seems that even if I use region / bridge in xmgmt, we can still keep it private to xmgmt instead of exposing the interface to outside world, which we can't support anyway? This means that region will be used as an internal data structure for xmgmt. Since we can't simply replace partition with region, we might as well just use partition throughout the driver, instead of introducing two data structures and use them both in different places.

However, if using region/bridge can bring in other benefits, please let us know and we could see if we can also add this to xmgmt.

> >
> > Or maybe we have missed some points about the use case for this
> framework?
> >

[...cut...]

> > > > +-----------------
> > > > +
> > > > +As mentioned previously xsabin stores metadata which advertise
> > > > +HW
> > > subsystems present in a partition.
> > > > +The metadata is stored in device tree format with well defined
> > > > +schema. Subsystem instantiations are captured as children of
> > > > +``addressable_endpoints`` node. Subsystem nodes have standard
> > > attributes like ``reg``, ``interrupts`` etc. Additionally the
> > > nodes also have PCIe specific attributes:
> > > > +``pcie_physical_function`` and ``pcie_bar_mapping``. These
> > > > +identify which PCIe physical function and which BAR space in
> > > > +that physical function the subsystem resides. XRT management
> > > > +driver uses this information to bind *platform drivers* to the
> > > > +subsystem instantiations. The platform drivers are found in
> > > > +**xrt-lib.ko** kernel module defined later. Below is an example
> > > > +of device tree for Alveo U50
> > > > +platform::
> > >
> > > I might be missing something, but couldn't you structure the
> > > addressable endpoints in a way that encode the physical function
> > > as a parent / child relation?
> >
> > Alveo driver does not generate the metadata. The metadata is
> > formatted
> and generated by HW tools when the Alveo HW platform is built.
>
> Sure, but you control the tools that generate the metadata :) Your
> userland can structure / process it however it wants / needs?

XRT is a runtime software stack, it is not responsible for generating HW metadata. It is one of the consumers of these data. The shell design is generated by a sophisticated tool framework which is difficult to change.

However, we will take this as a feedback for future revision of the tool.

Thanks,
Max