Re: [PATCH] remoteproc: Make rproc_get_by_phandle() work for clusters
From: Mathieu Poirier
Date: Wed Jan 04 2023 - 16:47:03 EST
On Wed, Jan 04, 2023 at 09:56:13AM -0600, Bjorn Andersson wrote:
> On Tue, Jan 03, 2023 at 11:48:56AM -0700, Mathieu Poirier wrote:
> > On Tue, 27 Dec 2022 at 08:11, Bjorn Andersson <andersson@xxxxxxxxxx> wrote:
> > >
> > > On Wed, Dec 14, 2022 at 03:16:43PM -0700, Mathieu Poirier wrote:
> > > > Multi-cluster remoteproc designs typically have the following DT
> > > > declaration:
> > > >
> > > > remoteproc_cluster {
> > > > compatible = "soc,remoteproc-cluster";
> > > >
> > > > core0: core0 {
> > > > compatible = "soc,remoteproc-core"
> > > > memory-region;
> > > > sram;
> > > > };
> > > >
> > > > core1: core1 {
> > > > compatible = "soc,remoteproc-core"
> > > > memory-region;
> > > > sram;
> > > > }
> > > > };
> > > >
> > > > A driver exists for the cluster rather than the individual cores
> > > > themselves so that operation mode and HW specific configurations
> > > > applicable to the cluster can be made.
> > > >
> > > > Because the driver exists at the cluster level and not the individual
> > > > core level, function rproc_get_by_phandle() fails to return the
> > > > remoteproc associated with the phandled it is called for.
> > > >
> > > > This patch enhances rproc_get_by_phandle() by looking for the cluster's
> > > > driver when the driver for the immediate remoteproc's parent is not
> > > > found.
> > > >
> > >
> > > Can you please help me understand why zynqmp_r5_remoteproc_probe()
> > > invokes devm_of_platform_populate() to create platform_device instances
> > > for the clusters?
> > >
> >
> > Platform device instances are created for the individual cores found
> > in the cluster, following the work done on TI's K3-R5[1].
> >
>
> Right, and this is a design pattern that I've been bitten by several
> times by now. There's no real purpose of spinning up platform_devices
> for those nodes.
>
Calling of_platform_populate() happened before my time in this subsystem. I
thought you were favourable to it. Can you give one or two examples where it caused
you grief?
> > > Why can't the platform_device for the cluster be used as parent for both
> > > remoteprocs and then the driver deal with the subnodes within the
> > > driver?
> > >
> >
> > That is exactly how things work for both K3-R5 and R5F architectures.
> > That said, if we use the cluster's platform device as parent of the
> > remote processors inside the cluster, function rproc_get_by_phandle()
> > will return the first remote processor it finds with a matching parent
> > rather than the remote processor referenced by the phandle parameter.
> >
>
> I missed the fact that we don't associate either the rproc or the rproc
> device with the of_node, but rather just rely on the fact that
> rproc->dev->parent->of_node is typically is the handle we're looking
> for.
>
> And I don't think we'll return the first instance, because
> rproc->dev->parent->of_node will never match the instance's of_node.
>
My first suggestion was also to use the cluster's device as parent to the remote
processors inside the cluster but it didn't work, though the exact details are
lost in the holiday's fairy dust. Looking more closely at the code I think you
are correct.
>
> I think it would be cleaner to add a of_node to struct rproc and use
> this for matching.
>
I also considered that option but decided to proceed differently because it
duplicates of_node information that is already available and requires
modifications to the drivers already using rproc_get_by_phandle(). Unless
I'm missing something we would still have to call of_platform_populate() to get
the of_node information... And modify the parameters to rproc_alloc(), which
cascades exponentially.
> And I do suggest that we don't of_platform_populate() in the TI driver.
> If nothing else, doing so saves ~2kb of wasted RAM...
>
And that would require a serious refactoring exercise that, in my opinion, far
outweigh the benefits.
Thanks,
Mathieu
> > [1]. https://elixir.bootlin.com/linux/v6.2-rc2/source/drivers/remoteproc/ti_k3_r5_remoteproc.c#L1728
> >
> > > Regards,
> > > Bjorn
> > >
> > > > Reported-by: Ben Levinsky <ben.levinsky@xxxxxxxxxx>
> > > > Signed-off-by: Mathieu Poirier <mathieu.poirier@xxxxxxxxxx>
> > > > ---
> > > > drivers/remoteproc/remoteproc_core.c | 28 +++++++++++++++++++++++++++-
> > > > 1 file changed, 27 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/drivers/remoteproc/remoteproc_core.c b/drivers/remoteproc/remoteproc_core.c
> > > > index 1cd4815a6dd1..91f82886add9 100644
> > > > --- a/drivers/remoteproc/remoteproc_core.c
> > > > +++ b/drivers/remoteproc/remoteproc_core.c
> > > > @@ -33,6 +33,7 @@
> > > > #include <linux/idr.h>
> > > > #include <linux/elf.h>
> > > > #include <linux/crc32.h>
> > > > +#include <linux/of_platform.h>
> > > > #include <linux/of_reserved_mem.h>
> > > > #include <linux/virtio_ids.h>
> > > > #include <linux/virtio_ring.h>
> > > > @@ -2110,7 +2111,9 @@ EXPORT_SYMBOL(rproc_detach);
> > > > #ifdef CONFIG_OF
> > > > struct rproc *rproc_get_by_phandle(phandle phandle)
> > > > {
> > > > + struct platform_device *cluster_pdev;
> > > > struct rproc *rproc = NULL, *r;
> > > > + struct device_driver *driver;
> > > > struct device_node *np;
> > > >
> > > > np = of_find_node_by_phandle(phandle);
> > > > @@ -2121,7 +2124,30 @@ struct rproc *rproc_get_by_phandle(phandle phandle)
> > > > list_for_each_entry_rcu(r, &rproc_list, node) {
> > > > if (r->dev.parent && device_match_of_node(r->dev.parent, np)) {
> > > > /* prevent underlying implementation from being removed */
> > > > - if (!try_module_get(r->dev.parent->driver->owner)) {
> > > > +
> > > > + /*
> > > > + * If the remoteproc's parent has a driver, the
> > > > + * remoteproc is not part of a cluster and we can use
> > > > + * that driver.
> > > > + */
> > > > + driver = r->dev.parent->driver;
> > > > +
> > > > + /*
> > > > + * If the remoteproc's parent does not have a driver,
> > > > + * look for the driver associated with the cluster.
> > > > + */
> > > > + if (!driver) {
> > > > + cluster_pdev = of_find_device_by_node(np->parent);
>
> Doing so also has the added benefit that we don't add an implicitly
> requirement on the rproc-device's parent being a platform_driver.
>
> Regards,
> Bjorn
>
> > > > + if (!cluster_pdev) {
> > > > + dev_err(&r->dev, "can't get parent\n");
> > > > + break;
> > > > + }
> > > > +
> > > > + driver = cluster_pdev->dev.driver;
> > > > + put_device(&cluster_pdev->dev);
> > > > + }
> > > > +
> > > > + if (!try_module_get(driver->owner)) {
> > > > dev_err(&r->dev, "can't get owner\n");
> > > > break;
> > > > }
> > > > --
> > > > 2.25.1
> > > >