**Next message:**Frank Oltmanns: "[RFC PATCH 2/3] clk: sunxi-ng: Implement precalculated NKM rate selection"**Previous message:**Marc Zyngier: "Re: [Question about gic vmovp cmd] Consider adding VINVALL after VMOVP"**Next in thread:**Frank Oltmanns: "[RFC PATCH 2/3] clk: sunxi-ng: Implement precalculated NKM rate selection"**Messages sorted by:**[ date ] [ thread ] [ subject ] [ author ]

I would like to bring your attention to the current process of setting the rate

of an NKM clock. As it stands, when setting the rate of an NKM clock, the rate

nearest but less than or equal to the requested rate is found, instead of the

nearest rate. Moreover, ccu_nkm_find_best() is called multiple times (footnote

[1]) when setting a rate, each time iterating over all combinations of n, k, and

m.

In response to this, I propose the following refinements to optimize the NKM

clock setting:

a. when finding the best rate use the nearest rate, even if it is greater than

the requested rate (PATCH 1)

b. utilize binary search to find the best rate by going through a

precalculated, ordered list of all meaningful combinations of n, k, and m

(PATCH 2)

To illustrate, consider an NKM clock with min_n = 1, max_n = 16, min_k = 2,

max_k = 4, min_m = 1, and max_m = 16. With the current approach, we have to go

through 1024 combinations, of which only 275 are actually meaningful (the

remaining 749 are combinations that result in the same frequency as other

combinations). So, when selecting from these sorted 275 combinations we find the

closest rate after 9 instead of 1024 steps.

As an example, I calculated the table off-line for the pll-mipi clock of

sun50i-a64 (PATCH 3). However, I have identified two other potential strategies:

2. calculate before first use and persist for subsequent uses (i.e. never free,

see footnote [2])

3. calculate before first use and free after setting the rate.

Each approach carries its own merits. The one I implemented is the most

efficient in terms of computation time but consumes the most memory. The second

saves compute time after the initial use, while the third minimizes memory usage

at the cost of additional computation.

The motivation for these proposed changes lies in the current behavior of rate

selection for NKM clocks, which doesn't observe the CLK_SET_RATE_PARENT flag.

I.e. it does not select a different rate for the parent clock to find the

optimal rate. I believe this is likely due to the fact that selecting a new rate

is quite compute intense, as it would involve iterating or calculating rates for

the parent clock and then testing each rate with different n, k, and m

combinations.

As an example, if the parent is an NM clock, we would have to work through the

combinations of the parent's factors (the parent's n) and divisor (the parent's

m). This results in five nested loops to evaluate all possible rates, an effort

that escalates if the parent could also influence the grandparent's rate. In my

example case (sun50i-a64) the pll-mipi's parent (pll-video0) has 2048

combinations of n and m, of which 1266 are meaningful because the others result

in the same frequency for pll-video0.

If we can come up with a way to iterate over the possible rates of a parent,

this would eventually allow us to make NKM clocks obey the CLK_SET_RATE_PARENT

flag. Because it would only require 11,349 (9 * 1,266) steps instead of

2,097,152 (1,024 * 2,048).

Things I considered but don't have a clear answer to:

- Is there a specific reason, why currently only clock rates less than the

requested rate are considered when setting a new rate?

- Do you think it is worth the memory and increased complexity to be able to

change the parent's clock rate?

I look forward to hearing your thoughts on these proposed changes. Thank you for

your time and consideration.

Footnotes:

[1] Multiple times because ccu_nkm_find_best is (indirectly) called from clk.c

in

- clk_core_req_round_rate_nolock()

- clk_calc_new_rates() (which in turn is called three times for reasons that

currently elude me)

- clk_change_rate

[2] Actually, we could free the memory in a new ccu_nkm_terminate() function,

which could become part of ccu_nkm_ops. But if my code searching skills don't

betray me, there is currently no other clock that implements the terminate

function.

Frank Oltmanns (3):

clk: sunxi-ng: nkm: Minimize difference when finding rate

clk: sunxi-ng: Implement precalculated NKM rate selection

clk: sunxi-ng: sun50i-a64: Precalculate NKM combinations for pll-mipi

drivers/clk/sunxi-ng/ccu-sun50i-a64.c | 281 ++++++++++++++++++++++++++

drivers/clk/sunxi-ng/ccu_mux.c | 2 +-

drivers/clk/sunxi-ng/ccu_nkm.c | 97 +++++++--

drivers/clk/sunxi-ng/ccu_nkm.h | 26 +++

4 files changed, 385 insertions(+), 21 deletions(-)

--

2.40.1

**Next message:**Frank Oltmanns: "[RFC PATCH 2/3] clk: sunxi-ng: Implement precalculated NKM rate selection"**Previous message:**Marc Zyngier: "Re: [Question about gic vmovp cmd] Consider adding VINVALL after VMOVP"**Next in thread:**Frank Oltmanns: "[RFC PATCH 2/3] clk: sunxi-ng: Implement precalculated NKM rate selection"**Messages sorted by:**[ date ] [ thread ] [ subject ] [ author ]