Re: [PATCH] tty: serial: qcom-geni-serial: minor fixes to get_clk_div_rate()

From: Doug Anderson
Date: Wed Jun 08 2022 - 18:38:44 EST


Hi,

On Wed, Jun 8, 2022 at 11:34 AM Vijaya Krishna Nivarthi
<quic_vnivarth@xxxxxxxxxxx> wrote:
>
> Hi,
>
>
> On 6/8/2022 12:55 AM, Doug Anderson wrote:
> > Hi,
> >
> > On Tue, Jun 7, 2022 at 10:40 AM Vijaya Krishna Nivarthi
> > <quic_vnivarth@xxxxxxxxxxx> wrote:
> >> Hi,
> >>
> >> On 6/7/2022 1:29 AM, Doug Anderson wrote:
> >>
> >> My only concern continues to be...
> >>
> >> Given ser_clk is the final frequency that this function is going to
> >> return and best_div is going to be the clk_divider, is it ok if the
> >> divider cant divide the frequency exactly?
> >>
> >> In other words, Can this function output combinations like (402,4)
> >> (501,5) ?
> >>
> >> If ok, then we can go ahead with this patch or even previous perhaps.
> > I don't see why not. You're basically just getting a resulting clock
> > that's not an integral "Hz", right?
> >
> > So if "baud" is 9600 and sampling_rate is 16 then desired_clk is (9600
> > * 16) = 153600
> >
> > Let's imagine that we do all the math and we finally decide that our
> > best bet is with the rate 922000 and a divider of 6. That means that
> > the actual clock we'll make is 153666.67 when we _wanted_ 153600.
> > There's no reason it needs to be integral, though, and 153666.67 would
> > still be better than making 160000.
> >
> Thank you for clarification.
> >>> power?)
> >> Actually power saving was the anticipation behind returning first
> >> frequency in original patch, when we cant find exact frequency.
> > Right, except that if you just pick the first clock you find it would
> > be _wildly_ off. I guess if you really want to do this the right way,
> > you need to set a maximum tolerance and pick the first rate you find
> > that meets that tolerance. Random web search for "uart baud rate
> > tolerance" makes me believe that +/- 5% deviation is OK, but to be
> > safe you probably want something lower. Maybe 2%? So if the desired
> > clock is within 2% of a clock you can make, can you just pick that
> > one?
> Ok, 2% seems good.
> >
> >>>> Please note that we go past cases when we have an divider that can
> >>>> exactly divide the frequency(105/1, 204/2, 303/3) and end up with one
> >>>> that doesn't.
> >>> Ah, good point. Luckily that's a 1-line fix, right?
> >> Apologies, I could not figure out how.
> > Ah, sorry. Not quite 1 line, but this (untested)
> >
> >
> > freq = clk_round_rate(clk, mult);
> >
> > if (freq % desired_clk == 0) {
> > ser_clk = freq;
> > best_div = freq / desired_clk;
> > break;
> > }
> >
> > candidate_div = max(1, DIV_ROUND_CLOSEST(freq, desired_clk));
> > candidate_freq = freq / candidate_div;
> > diff = abs((long)desired_clk - candidate_freq);
> > if (diff < best_diff) {
> > best_diff = diff;
> > ser_clk = freq;
> > best_div = candidate_div;
> > }
>
> But then once again, we would likely need 2 loops because while we are
> ok with giving up on search for best_div on finding something within 2%
> tolerance, we may not want to give up on exact match (freq % desired_clk
> == 0 )

Ah, it took me a while to understand why two loops. It's because in
one case you're trying multiplies and in the other you're bumping up
to the next closest clock rate. I don't think you really need to do
that. Just test the (rate - 2%) and the rate. How about this (only
lightly tested):

ser_clk = 0;
maxdiv = CLK_DIV_MSK >> CLK_DIV_SHFT;
div = 1;
while (div < maxdiv) {
mult = (unsigned long long)div * desired_clk;
if (mult != (unsigned long)mult)
break;

two_percent = mult / 50;

/*
* Loop requesting (freq - 2%) and possibly (freq).
*
* We'll keep track of the lowest freq inexact match we found
* but always try to find a perfect match. NOTE: this algorithm
* could miss a slightly better freq if there's more than one
* freq between (freq - 2%) and (freq) but (freq) can't be made
* exactly, but that's OK.
*
* This absolutely relies on the fact that the Qualcomm clock
* driver always rounds up.
*/
test_freq = mult - two_percent;
while (test_freq <= mult) {
freq = clk_round_rate(clk, test_freq);

/*
* A dead-on freq is an insta-win. This implicitly
* handles when "freq == mult"
*/
if (!(freq % desired_clk)) {
*clk_div = freq / desired_clk;
return freq;
}

/*
* Only time clock framework doesn't round up is if
* we're past the max clock rate. We're done searching
* if that's the case.
*/
if (freq < test_freq)
return ser_clk;

/* Save the first (lowest freq) within 2% */
if (!ser_clk && freq <= mult + two_percent) {
ser_clk = freq;
*clk_div = div;
}

/*
* If we already rounded up past mult then this will
* cause the loop to exit. If not then this will run
* the loop a second time with exactly mult.
*/
test_freq = max(freq + 1, mult);
}

/*
* test_freq will always be bigger than mult by at least 1.
* That means we can get the next divider with a DIV_ROUND_UP.
* This has the advantage of skipping by a whole bunch of divs
* If the clock framework already bypassed them.
*/
div = DIV_ROUND_UP(test_freq, desired_clk);
}

return ser_clk;