Re: [PATCH] firmware: arm_scmi: add timeout in do_xfer_with_response()

From: Sudeep Holla
Date: Wed Apr 09 2025 - 07:12:26 EST


On Thu, Apr 03, 2025 at 10:50:17PM +0300, Matthew Bystrin wrote:
> Hi Sudeep, Cristian,
>
> Thanks for having a look on the patch.
>
> Cristian Marussi, Apr 02, 2025 at 19:05:
> > > Please post this patch along with the vendor specific protocols mentioned
> > > above and with the reasoning as why 2s is not sufficient.
> >
> > Ack on this, it would be good to understand why a huge 2 secs is not
> > enough...and also...
>
> I've been working on firmware update using SCMI vendor/platform-specific
> extension on FPGA prototype, so not posted it initially. I'm open to share the
> details if needed, but need some extra time for preparations. For now I'm
> posting a brief description of the extension. It has 2 commands:
>
> - Obtain firmware version number.
> - Update firmware. Firmware image is placed into shared physically contiguous
> memory, Agent sends to platform micro controller (PuC) physical address and
> size of the update image to start update procedure. After update is completed
> (successfully or not) PuC sends delayed response.
>
> Agent ---- start update ---> Platform uC
> Agent <--- update procedure started ---- Platform uC
> ...
> Agent <--- (async) update completed ---- Platform uC
>
> I've faced timeout problem with the async completion response. And update can't
> be done faster than 10s due to SPI flash write speed limit.
>

Understood.

> Why not to use notifications?
>
> First of all, semantics. IIUC notifications can be sent by PuC in any time. This
> is not suitable for updates, because procedure is initiated by an agent, not by
> a platform.
>

The start update should retain as soon as Platform uC acks the request.
And 2 notifications can be sent out for update procedure started and
completed. I don't see any issue there. What is the semantics you are
talking about ?

> Secondly, code implementing notification waiting duplicates delayed response
> code. I had implemented it as a proof-of-concept before I prepared this patch.
>

Even delayed response as some timeout so I would rather prefer to use
notifications in your usecase as it is completely async.

> > > Also instead of churning up existing users/usage, we can explore to had
> > > one with this timeout as alternative if you present and convince the
> > > validity of your use-case and the associated timing requirement.
> > >
> >
> > ...with the proposed patch (and any kind of alternative API proposed
> > by Sudeep) the delayed response timeout becomes a parameter of the method
> > do_xfer_with_response() and so, as a consequence, this timoeut becomes
> > effectively configurable per-transaction, while usually a timeout is
> > commonly configurable per-channel,
>
> Totally agree, usually it is. And that's why I didn't change do_xfer() call.
> Here is the thing I want to pay attention to.
>
> Let's focus on delayed responses. I think delayed response timeout should not be
> defined by transport but rather should be defined by _function_ PuC providing.
> And of course platform and transport could influence on the timeout value.
>

I think in your case, it is not even transport specific. It is more operation
specific and hence I prefer notifications.

> > so valid as a whole for any protocol
> > on that channel across the whole platform, AND optionally describable as
> > different from the default standard value via DT props (like max-rx-timeout).
> >
> > Is this what we want ? (a per-transaction configurable timeout ?)
> >
> > If not, it could be an option to make instead this a per-channel optional
> > new DT described property so that you can configure globally a different
> > delayed timeout.
>
> Taking into account my previous comment, I don't think that having a per-channel
> timeout for delayed response would solve the problem in the right way. What
> about having a per-protocol timeout at least?
>

Yes neither per-transport nor per-protocol timeout will suffice in your case.
This 10s timeout is specific to the update operation and hence use
notification. All other solution is just workarounds not generic solution.

--
Regards,
Sudeep