Re: [RFC PATCH] remoteproc: k3-r5: Fix check performed in k3_r5_rproc_{mbox_callback/kick}

From: Mathieu Poirier
Date: Thu Sep 19 2024 - 04:33:41 EST


s

On Tue, 17 Sept 2024 at 03:40, Beleswar Prasad Padhi <b-padhi@xxxxxx> wrote:
>
> Hi Mathieu,
>
> On 17/09/24 14:07, Mathieu Poirier wrote:
> > On Mon, 16 Sept 2024 at 23:20, Kumar, Udit <u-kumar1@xxxxxx> wrote:
> >> On 9/16/2024 8:50 PM, Mathieu Poirier wrote:
> >>> On Mon, 16 Sept 2024 at 02:31, Siddharth Vadapalli <s-vadapalli@xxxxxx> wrote:
> >>>> Commit f3f11cfe8907 ("remoteproc: k3-r5: Acquire mailbox handle during
> >>>> probe routine") introduced a check in the "k3_r5_rproc_mbox_callback()" and
> >>>> "k3_r5_rproc_kick()" callbacks, causing them to exit if the remote core's
> >>>> state is "RPROC_DETACHED". However, the "__rproc_attach()" function that is
> >>>> responsible for attaching to a remote core, updates the state of the remote
> >>>> core to "RPROC_ATTACHED" only after invoking "rproc_start_subdevices()".
> >>>>
> >>>> The "rproc_start_subdevices()" function triggers the probe of the Virtio
> >>>> RPMsg devices associated with the remote core, which require that the
> >>>> "k3_r5_rproc_kick()" and "k3_r5_rproc_mbox_callback()" callbacks are
> >>>> functional. Hence, drop the check in the callbacks.
> >>> Honestly, I am very tempted to just revert f3f11cfe8907 and ea1d6fb5b571.
> >>
> >> Please don't :) , it will break rproc in general for k3 devices.
> >>
> > Why not - it is already broken anyway. Reverting the patches will
> > force TI to actually think about the feature in terms of design,
> > completeness and testing. The merge window opened on Sunday - I'm not
> > going to merge whack-a-mole patches and hope the right fix comes
> > along.
>
>
> Apologies for causing this trouble, Mathieu. I have accumulated various
> use-cases of the driver, including this, and hereon will keep in mind
> while posting further patches.
>
> >
> >> Couple of solutions for this race around condition (in mine preference
> >> order)
> >>
> > This is for the TI team to discuss _and_ test thoroughly. From hereon
> > and until I see things improve, all patches from TI will need to be
> > tagged with R-B and T-B tags (collected on the mailing lists) from two
> > different individuals before I look at them.
>
>
> Understood, that is a fair ask. Hereon, I will also attach my test logs
> for all the usecases I've tested a patch with, to give more visibility
> on the testing done.
>

I am not in a position (nor have the time) to look at test logs and as
such, there is no point in doing that. That time should be better
spent building a CI that goes through all the scenarios and making
sure that all patches pass the tests before being sent to the mailing
list.

> >
> >> 1) In
> >> https://elixir.bootlin.com/linux/v6.11/source/drivers/remoteproc/ti_k3_r5_remoteproc.c#L190
> >> have a check , if probe in is progress or not
> >>
> >> 2)
> >> https://elixir.bootlin.com/linux/v6.11/source/drivers/remoteproc/ti_k3_r5_remoteproc.c#L1205
> >> -- correct the state to ON or something else
> >>
> >> 3) Move condition
> >> https://elixir.bootlin.com/linux/v6.11/source/drivers/remoteproc/remoteproc_core.c#L1360
> >> before rproc_start_subdevices
> >> <https://elixir.bootlin.com/linux/v6.11/C/ident/rproc_start_subdevices>
> >> calling
> >>
> >>
> >>
> >>>> Fixes: f3f11cfe8907 ("remoteproc: k3-r5: Acquire mailbox handle during probe routine")
> >>>> Signed-off-by: Siddharth Vadapalli <s-vadapalli@xxxxxx>
> >>>> ---
> >>>>
> >>>> Hello,
> >>>>
> >>>> Since the commit being fixed is not yet a part of Mainline Linux, this
> >>>> patch is based on linux-next tagged next-20240913.
> >>>>
> >>>> An alternative to this patch will be a change to the "__rproc_attach()"
> >>>> function in the "remoteproc_core.c" driver with
> >>>> rproc->state = RPROC_ATTACHED;
> >>>> being set after "rproc_attach_device()" is invoked, but __before__
> >>>> invoking "rproc_start_subdevices()". Since this change will be performed
> >>>> in the common Remoteproc Core, it appeared to me that fixing it in the
> >>>> TI remoteproc driver is the correct approach.
> >>>>
> >>>> The equivalent of this patch for ti_k3_dsp_remoteproc.c might also be
> >>>> required, which I shall post if the current patch is acceptable.
> >>>>
> >>>> Kindly review and share your feedback on this patch.
> >>>>
> >>>> Regards,
> >>>> Siddharth.
> >>>>
> >>>> drivers/remoteproc/ti_k3_r5_remoteproc.c | 8 --------
> >>>> 1 file changed, 8 deletions(-)
> >>>>
> >>>> diff --git a/drivers/remoteproc/ti_k3_r5_remoteproc.c b/drivers/remoteproc/ti_k3_r5_remoteproc.c
> >>>> index 747ee467da88..4894461aa65f 100644
> >>>> --- a/drivers/remoteproc/ti_k3_r5_remoteproc.c
> >>>> +++ b/drivers/remoteproc/ti_k3_r5_remoteproc.c
> >>>> @@ -194,10 +194,6 @@ static void k3_r5_rproc_mbox_callback(struct mbox_client *client, void *data)
> >>>> const char *name = kproc->rproc->name;
> >>>> u32 msg = omap_mbox_message(data);
> >>>>
> >>>> - /* Do not forward message from a detached core */
> >>>> - if (kproc->rproc->state == RPROC_DETACHED)
> >>>> - return;
> >>>> -
> >>>> dev_dbg(dev, "mbox msg: 0x%x\n", msg);
> >>>>
> >>>> switch (msg) {
> >>>> @@ -233,10 +229,6 @@ static void k3_r5_rproc_kick(struct rproc *rproc, int vqid)
> >>>> mbox_msg_t msg = (mbox_msg_t)vqid;
> >>>> int ret;
> >>>>
> >>>> - /* Do not forward message to a detached core */
> >>>> - if (kproc->rproc->state == RPROC_DETACHED)
> >>>> - return;
> >>>> -
> >>>> /* send the index of the triggered virtqueue in the mailbox payload */
> >>>> ret = mbox_send_message(kproc->mbox, (void *)msg);
> >>>> if (ret < 0)
> >>>> --
> >>>> 2.40.1
> >>>>