Re: [PATCH] remoteproc: xlnx: reset virtio status during attach
From: Mathieu Poirier
Date: Tue Mar 31 2026 - 14:01:57 EST
On Mon, Mar 30, 2026 at 01:43:03PM -0500, Shah, Tanmay wrote:
> Hello,
>
> Thanks for the reviews. Please find my comments below:
>
> On 3/27/2026 2:58 PM, Mathieu Poirier wrote:
> > On Tue, Mar 17, 2026 at 01:12:51PM -0700, Tanmay Shah wrote:
> >> On AMD-Xilinx platforms cortex-A and cortex-R can be configured as
> >> separate subsystems. In this case, both cores can boot independent of
> >> each other. If Linux went through uncontrolled reboot during active
> >> rpmsg communication, then during next boot it can find rpmsg virtio
> >> status not in the reset state. In such case it is important to reset the
> >> virtio status during attach callback and wait for sometime for the
> >> remote to handle virtio driver reset.
> >
> > I understand the user case, but won't that situation happen regardless of
> > whether cores operate sync or split mode?
> >
>
> I want to make it clear that this is not same as Cortex-R cluster
> configured as lockstep vs split mode.
>
> This is different configuration between Cortex-A cores and Cortex-R
> cores. It is a firmware driver configuration of how it treats cortex-A
> and Cortex-R subsystems.
>
> In the firmware driver, we can configure Cortex-A cluster as owner of
> Cortex-R cluster, and in that case, if Cortex-A reboots, the firmware
> will also reboot cortex-R cores. This policy makes Cortex-A as owner of
> Cortex-R cores. In this configuraton this patch is not needed, because
> if Cortex-A reboots then platform management firmware will also reboot
> Cortex-R cores as well and vdev status flag will be always 0.
>
> In another configuration in the firmware driver, Cortex-R cores can be
> independent of Cortex-A cores. This means, Cortex-A is not the owner of
> the Cortex-R cores. Both are independent subsystem. Only in this
> configuration, this patch applies because it is possible that Cortex-A
> reboots while Cortex-R doesn't and Cortex-R still runs as it is.
>
> So only in the second type of configuration, this patch is needed when
> COrtex-A running linux reboots and when driver probes and tries to
> attach it can find that vdev flag is not reset. In the first
> configuartion if linux reboots, then It's guranteed that vdev status
> flag will always be in the reset state.
>
> If you prefer I can extend the commit message with all above details or
> put as comment in the attach() callback. Let me know which do you prefer.
Ok, that clarifies a lot of things. Please add the above as a comment in
attach().
>
> >>
> >> Signed-off-by: Tanmay Shah <tanmay.shah@xxxxxxx>
> >> ---
> >> drivers/remoteproc/xlnx_r5_remoteproc.c | 46 +++++++++++++++++++++++++
> >> 1 file changed, 46 insertions(+)
> >>
> >> diff --git a/drivers/remoteproc/xlnx_r5_remoteproc.c b/drivers/remoteproc/xlnx_r5_remoteproc.c
> >> index 50a9974f3202..f08806f13800 100644
> >> --- a/drivers/remoteproc/xlnx_r5_remoteproc.c
> >> +++ b/drivers/remoteproc/xlnx_r5_remoteproc.c
> >> @@ -5,6 +5,7 @@
> >> */
> >>
> >> #include <dt-bindings/power/xlnx-zynqmp-power.h>
> >> +#include <linux/delay.h>
> >> #include <linux/dma-mapping.h>
> >> #include <linux/firmware/xlnx-zynqmp.h>
> >> #include <linux/kernel.h>
> >> @@ -29,6 +30,8 @@
> >> #define RSC_TBL_XLNX_MAGIC ((uint32_t)'x' << 24 | (uint32_t)'a' << 16 | \
> >> (uint32_t)'m' << 8 | (uint32_t)'p')
> >>
> >> +#define RPROC_ATTACH_TIMEOUT_US (100 * 1000)
> >> +
> >
> > There are some time constant already defined, please use them.
>
> Ack.
>
> >
> >> /*
> >> * settings for RPU cluster mode which
> >> * reflects possible values of xlnx,cluster-mode dt-property
> >> @@ -865,6 +868,49 @@ static int zynqmp_r5_get_rsc_table_va(struct zynqmp_r5_core *r5_core)
> >>
> >> static int zynqmp_r5_attach(struct rproc *rproc)
> >> {
> >> + struct device *dev = &rproc->dev;
> >> + bool wait_for_remote = false;
> >> + struct fw_rsc_vdev *rsc;
> >> + struct fw_rsc_hdr *hdr;
> >> + int i, offset, avail;
> >> +
> >> + if (!rproc->table_ptr)
> >> + goto attach_success;
> >> +
> >> + for (i = 0; i < rproc->table_ptr->num; i++) {
> >> + offset = rproc->table_ptr->offset[i];
> >> + hdr = (void *)rproc->table_ptr + offset;
> >> + avail = rproc->table_sz - offset - sizeof(*hdr);
> >> + rsc = (void *)hdr + sizeof(*hdr);
> >> +
> >> + /* make sure table isn't truncated */
> >> + if (avail < 0) {
> >> + dev_err(dev, "rsc table is truncated\n");
> >> + return -EINVAL;
> >> + }
> >> +
> >> + if (hdr->type != RSC_VDEV)
> >> + continue;
> >> +
> >> + /*
> >> + * reset vdev status, in case previous run didn't leave it in
> >> + * a clean state.
> >> + */
> >> + if (rsc->status) {
> >> + rsc->status = 0;
> >> + wait_for_remote = true;
> >> + break;
> >> + }
> >> + }
> >> +
> >> + /* kick remote to notify about attach */
> >> + rproc->ops->kick(rproc, 0);
> >> +
> >> + /* wait for sometime until remote is ready */
> >> + if (wait_for_remote)
> >> + usleep_range(100, RPROC_ATTACH_TIMEOUT_US);
> >
> > Instead of waiting, would it be possible to return -EPROBE_DEFER and let the
> > driver core retry mechanic do it's work?
> >
>
> It is not possible to do -EPROBE_DEFER, because attach() callback is not
> called only during driver probe.
>
> It is also called during following command sequence:
>
> attach() -> detach() -> attach()
>
> During second attach() callback, we can't do -EPROBE_DEFER, as it's not
> driver probe anymore. So I think will have to keep the usleep_range().
Right, but in this case the Cortex-A did not go through an uncontrolled reboot,
we know the state of the machine is sound. Do you see a scenario where it would
not be the case?
>
> > Thanks,
> > Mathieu
> >
> >> +
> >> +attach_success:
> >> dev_dbg(&rproc->dev, "rproc %d attached\n", rproc->index);
> >>
> >> return 0;
> >>
> >> base-commit: d4ef36fbd57e610d4c334123ce706a2a71187cae
> >> --
> >> 2.34.1
> >>
>