Re: [PATCH v2 2/2] firmware: arm_scmi: round rate bisecting in discrete rates

From: Etienne CARRIERE - foss
Date: Tue Dec 10 2024 - 05:57:01 EST


On Monday, December 9, 2024, Sudeep Holla wrote:
> On Mon, Dec 09, 2024 at 12:59:58PM +0000, Etienne CARRIERE - foss wrote:
> > Hello Sudeep,
> >
> > On Monday, December 9, 2024 11:46 AM, Sudeep Holla wrote:
> > > On Tue, Dec 03, 2024 at 06:39:08PM +0100, Etienne Carriere wrote:
> > > > Implement clock round_rate operation for SCMI clocks that describe a
> > > > discrete rates list. Bisect into the supported rates when using SCMI
> > > > message CLOCK_DESCRIBE_RATES to optimize SCMI communication transfers.
> > >
> > > Let me stop here and try to understand the requirement here. So you do
> > > communicate with the firmware to arrive at this round_rate ? Does the
> > > list of discrete clock rates change at run-time, which creates the
> > > need for it ? Or will the initial list just include max and min ?
> >
> > I don't expect the list to change at run-time. The initial list is
> > expected to describe all supported rates. But because this list may
> > be big, I don't think arm_scmi/clock.c driver should store the full list
> > of all supported rates for each of the SCMI clocks. It would cost too
> > much memory. Therefore I propose to query it at runtime, when
> > needed, and bisect to lower the number of required transactions
> > between the agent and the firmware.
> >
>
> Ah so, this is nothing to do with set_parent, but just an optimisation.
> This change optimises for space but some other platform may have all the
> space but the communication with SCMI platform is not good enough to make
> runtime calls like this change. How do we cater that then ?

This change does not optimize memory. It implements a real clk_round_rate()
operation for SCMI clocks that have a discrete supported rates list. The
existing implementation does not support it: it behaves as if the
requested rate is supported and lets the caller change the clock rate to
find out which rounded rate it effectively gets. This does not suit
audio and video clock constraints.
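
To illustrate the idea, here is a minimal user-space sketch of the bisecting
strategy, not the code from the patch. It assumes the rates are sorted in
ascending order and that rounding picks the closest supported rate. The names
fw_rates, fetch_rates(), CHUNK_MAX and round_rate_bisect() are hypothetical:
fetch_rates() only simulates one CLOCK_DESCRIBE_RATES-like exchange that
returns a small window of rates starting at a given index.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CHUNK_MAX 4	/* rates per simulated response (assumption) */

/* Hypothetical firmware-side list of discrete rates, sorted ascending. */
static const uint64_t fw_rates[] = {
	8000000, 11289600, 12288000, 16000000, 22579200, 24576000,
	32000000, 45158400, 49152000, 64000000, 90316800, 98304000,
};
#define FW_NUM_RATES (sizeof(fw_rates) / sizeof(fw_rates[0]))

static unsigned int fw_transactions;	/* counts simulated exchanges */

/* Simulate one exchange: copy up to CHUNK_MAX rates starting at @index. */
static size_t fetch_rates(size_t index, uint64_t *out, size_t *total)
{
	size_t n = 0;

	fw_transactions++;
	*total = FW_NUM_RATES;
	while (n < CHUNK_MAX && index + n < FW_NUM_RATES) {
		out[n] = fw_rates[index + n];
		n++;
	}
	return n;
}

/* Pick the closer of two bounds; ties go to the lower rate. */
static uint64_t closest(uint64_t below, uint64_t above, uint64_t target)
{
	return (target - below <= above - target) ? below : above;
}

/* Linear scan of a chunk known to bracket @target. */
static uint64_t scan_chunk(const uint64_t *chunk, size_t n, uint64_t target)
{
	size_t i = 0;

	assert(n > 0 && chunk[0] <= target && target <= chunk[n - 1]);
	while (chunk[i] < target)
		i++;
	return i ? closest(chunk[i - 1], chunk[i], target) : chunk[0];
}

static uint64_t round_rate_bisect(uint64_t target)
{
	uint64_t chunk[CHUNK_MAX], below, above = 0;
	bool have_above = false;
	size_t lo, hi, n, total;

	/* The first exchange also tells us how many rates exist. */
	n = fetch_rates(0, chunk, &total);
	if (!n)
		return target;	/* legacy behavior: assume rate is supported */
	if (target <= chunk[0])
		return chunk[0];
	if (target <= chunk[n - 1])
		return scan_chunk(chunk, n, target);
	below = chunk[n - 1];
	lo = n;
	hi = total;

	/* Bisect on the start index of the window requested from firmware. */
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		n = fetch_rates(mid, chunk, &total);
		if (!n)
			return target;
		if (target < chunk[0]) {
			above = chunk[0];
			have_above = true;
			hi = mid;
		} else if (target > chunk[n - 1]) {
			below = chunk[n - 1];
			lo = mid + n;
		} else {
			return scan_chunk(chunk, n, target);
		}
	}
	/* Target sits between two windows, or above the highest rate. */
	return have_above ? closest(below, above, target) : below;
}
```

With this made-up list, round_rate_bisect(48000000) returns 49152000 after
3 simulated exchanges, and a request above the highest rate returns the
highest rate.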

How to deal with both platforms that have large memory but slow SCMI
communication and those with the opposite? I think the easiest way
would be to have a dedicated SCMI Clock protocol command.

>
> We need some spec-ed way or a unique way to identify what is best for
> the platform IMO. We can change the way you have done in this change set
> as someone else may complain in the future that it is costly to send
> such command every time a clock needs to be set. I am just guessing here
> may not be true.
>
> > >
> > > > Parse the rate list array when the target rate fit in the bounds
> > > > of the command response for simplicity.
> > > >
> > >
> > > I don't understand what you mean by this.
> >
> > I meant here that we bisect into the supported rates when communicating
> > with the firmware, but once a firmware response provides the list portion
> > that the target rate fits into, we just scan that array instead of bisecting
> > into it. We could also bisect into that array but it is likely quite small
> > (<128 bytes in existing SCMI transport drivers) and that would add a bit
> > more code for not much gain IMHO.
> >
> >
> > >
> > > > If for some reason the sequence fails or if the SCMI driver has no
> > > > round_rate SCMI clock handler, then fallback to the legacy strategy that
> > > > returned the target rate value.
> > > >
> > >
> > > Hmm, so we perform some extra dance but we are okay to fallback to default.
> > > I am more confused.
> >
> > Here, I propose to preserve the existing sequence in clk/clk-scmi.c in case
> > arm_scmi/clock.c does not implement this new round_rate SCMI clock
> > operation (it can be the case if these 2 drivers are .ko modules, not
> > well known built-in drivers).
> >
>
> I don't think it would work if it is not built on the same kernel anyways.
> I don't work much about this use-case.

Using the same kernel does not guarantee that the driver was not modified
relative to the vanilla upstream version. This may also be true for built-in
drivers, I guess.

>
> > >
> > > > Operation handle scmi_clk_determine_rate() is changed to get the effective
> > > > supported rounded rate when there is no clock re-parenting operation
> > > > supported. Otherwise, preserve the implementation that assumed any
> > > > clock rate could be obtained.
> > > >
> > >
> > > OK, now I think I am getting some idea. Is this the case where the parent
> > > has changed and the describe rates can give a different result at run-time.
> >
> > This does not deal with whether parent has changed or not. I would expect
> > the same request sent multiple times to provide the very same result. But
> > as I said above, I don't think arm_scmi/clock.c should consume a possibly
> > large array of memory to store all supported rates for each of the SCMI clocks
> > (that describe discrete rates).
> >
>
> Right, my assumption was totally wrong. Thanks for confirming.
>
> > An alternate way could be to add an SCMI Clock protocol command in the
> > spec allowing agent to query a closest supported rate, in 1 shot. Maybe
> > this new command could return both rounded rate and the SCMI parent
> > clock needed to reach that rounded rate, better fitting clk_determine_rate()
> > expectations.
> >
>
> May be that would be ideal but you need to make a case for such a spec change.

We need effective round_rate support for STM32MP2 platforms where audio
and video clocks are provided by a clock exposed by the SCMI server. These
drivers detect the (possibly external) device needs at runtime and need
to select an input clock that fits some constraints for quality reasons.
Audio quality is the most sensitive to clock rate inaccuracy.

>
> > >
> > > I need to re-read the part of the spec, but we may need some clarity so
> > > that this implementation is not vendor specific. I am yet to understand this
> > > fully. I just need to make sure spec covers this aspect and anything we
> > > add here is generic solution.
> > >
> > > I would like to avoid this extra query if not required which you seem to
> > > have made an attempt but I just want to be thorough and make sure that's
> > > what we need w.r.t the specification.
> >
> > Sure, I indeed prefer clear and robust implementation in the long term,
> > being the one I propose here or another one.
> >
>
> Good then, we can work towards achieving that. If you can specify how slow
> or memory hungry is it without these changes and how much this change helps
> your platform, we can take it up with spec authors and see if they are happy
> to provide some alternative to deal with this in a generic way.

The platforms we target usually have plenty of RAM, let's say hundreds of MBytes.
Not that much for some systems but enough, I guess, to store a few hundred
clock rates for a few dozen clocks (a few kBytes of RAM).

That said, thinking more and more about this, I really believe a dedicated SCMI
clock protocol command would better fit platform needs in the long term.

BR,
Etienne

>
> --
> Regards,
> Sudeep
>