Re: [PATCH v3 2/2] pwm: sprd: Add Spreadtrum PWM support

From: Baolin Wang
Date: Thu Aug 15 2019 - 07:06:08 EST


On Thu, 15 Aug 2019 at 18:11, Uwe Kleine-KÃnig
<u.kleine-koenig@xxxxxxxxxxxxxx> wrote:
>
> Hello,
>
> On Thu, Aug 15, 2019 at 05:34:02PM +0800, Baolin Wang wrote:
> > On Thu, 15 Aug 2019 at 16:54, Uwe Kleine-KÃnig
> > <u.kleine-koenig@xxxxxxxxxxxxxx> wrote:
> > > On Thu, Aug 15, 2019 at 04:16:32PM +0800, Baolin Wang wrote:
> > > > On Thu, 15 Aug 2019 at 14:15, Uwe Kleine-KÃnig
> > > > <u.kleine-koenig@xxxxxxxxxxxxxx> wrote:
> > > > > On Thu, Aug 15, 2019 at 11:34:27AM +0800, Baolin Wang wrote:
> > > > > > On Wed, 14 Aug 2019 at 23:03, Uwe Kleine-KÃnig
> > > > > > <u.kleine-koenig@xxxxxxxxxxxxxx> wrote:
> > > > > > > On Wed, Aug 14, 2019 at 08:46:11PM +0800, Baolin Wang wrote:
> > > > > > > > + /*
> > > > > > > > + * The hardware provides a counter that is feed by the source clock.
> > > > > > > > + * The period length is (PRESCALE + 1) * MOD counter steps.
> > > > > > > > + * The duty cycle length is (PRESCALE + 1) * DUTY counter steps.
> > > > > > > > + * Thus the period_ns and duty_ns calculation formula should be:
> > > > > > > > + * period_ns = NSEC_PER_SEC * (prescale + 1) * mod / clk_rate
> > > > > > > > + * duty_ns = NSEC_PER_SEC * (prescale + 1) * duty / clk_rate
> > > > > > > > + */
> > > > > > > > + val = sprd_pwm_read(spc, pwm->hwpwm, SPRD_PWM_PRESCALE);
> > > > > > > > + prescale = val & SPRD_PWM_PRESCALE_MSK;
> > > > > > > > + tmp = (prescale + 1) * NSEC_PER_SEC * SPRD_PWM_MOD_MAX;
> > > > > > > > + state->period = DIV_ROUND_CLOSEST_ULL(tmp, chn->clk_rate);
> > > > > > > > +
> > > > > > > > + val = sprd_pwm_read(spc, pwm->hwpwm, SPRD_PWM_DUTY);
> > > > > > > > + duty = val & SPRD_PWM_DUTY_MSK;
> > > > > > > > + tmp = (prescale + 1) * NSEC_PER_SEC * duty;
> > > > > > > > + state->duty_cycle = DIV_ROUND_CLOSEST_ULL(tmp, chn->clk_rate);
> > > > > > > > +
> > > > > > > > + /* Disable PWM clocks if the PWM channel is not in enable state. */
> > > > > > > > + if (!state->enabled)
> > > > > > > > + clk_bulk_disable_unprepare(SPRD_PWM_CHN_CLKS_NUM, chn->clks);
> > > > > > > > +}
> > > > > > > > +
> > > > > > > > +static int sprd_pwm_config(struct sprd_pwm_chip *spc, struct pwm_device *pwm,
> > > > > > > > + int duty_ns, int period_ns)
> > > > > > > > +{
> > > > > > > > + struct sprd_pwm_chn *chn = &spc->chn[pwm->hwpwm];
> > > > > > > > + u32 prescale, duty;
> > > > > > > > + u64 tmp;
> > > > > > > > +
> > > > > > > > + /*
> > > > > > > > + * The hardware provides a counter that is feed by the source clock.
> > > > > > > > + * The period length is (PRESCALE + 1) * MOD counter steps.
> > > > > > > > + * The duty cycle length is (PRESCALE + 1) * DUTY counter steps.
> > > > > > > > + *
> > > > > > > > + * To keep the maths simple we're always using MOD = SPRD_PWM_MOD_MAX.
> > > > > > >
> > > > > > > Did you spend some thoughts about how wrong your period can get because
> > > > > > > of that "lazyness"?
> > > > > > >
> > > > > > > Let's assume a clk rate of 100/3 MHz. Then the available period lengths
> > > > > > > are:
> > > > > > >
> > > > > > > PRESCALE = 0 -> period = 7.65 Âs
> > > > > > > PRESCALE = 1 -> period = 15.30 Âs
> > > > > > > ...
> > > > > > > PRESCALE = 17 -> period = 137.70 Âs
> > > > > > > PRESCALE = 18 -> period = 145.35 Âs
> > > > > > >
> > > > > > > So the error can be up to (nearly) 7.65 Âs (or in general
> > > > > >
> > > > > > Yes, but for our use case (pwm backlight), the precision can meet our
> > > > > > requirement. Moreover, we usually do not change the period, just
> > > > > > adjust the duty to change the back light.
> > > > >
> > > > > Is this a license requirement for you SoC to only drive a backlight with
> > > > > the PWM? The idea of having a PWM driver on your platform is that it can
> > > > > also be used to control a step motor or a laser.
> > > >
> > > > Not a license requirement. Until now we have not got any higher
> > > > precision requirements, and we've run this driver for many years in
> > > > our downstream kernel.
> > >
> > > I understood that you're not ambitious to do something better than "it
> > > worked for years".
> >
> > How do you know that?
>
> I showed you how you could match the requested PWM output better and
> you refused telling it worked for years and the added precision isn't
> necessary for a backlight.

Please I said the reason, it is not that I do not want a better
precision. The problem is we do not know how much precision to be
asked by users if no use case, and if improve the precision, which
means we should not keep MOD as a constant and suitable value, we
should do more mathematics to get a suitable MOD and DUTY, but there
is no necessary for now.

> > If there are some cases expect a higher precision, then we can analyze
> > how precision asked by the user, then we have a goal to improve it,
> > even improve the hardware. But now, I said there are no these use
> > cases, why I should add more mathematics to increase load and
> > complication.
> >
> > > > > > > PRESCALE = 18 and MOD = 254 you get a period of 144.78 Âs and so the
> > > > > > > error is only 0.56 Âs which is a factor of 13 better.
> > > > > > >
> > > > > > > Hmm.
> > > > > > >
> > > > > > > > + * The value for PRESCALE is selected such that the resulting period
> > > > > > > > + * gets the maximal length not bigger than the requested one with the
> > > > > > > > + * given settings (MOD = SPRD_PWM_MOD_MAX and input clock).
> > > > > > > > + */
> > > > > > > > + duty = duty_ns * SPRD_PWM_MOD_MAX / period_ns;
> > > > > > >
> > > > > > > I wonder if you loose some precision here as you use period_ns but might
> > > > > > > actually implement a shorter period.
> > > > > > >
> > > > > > > Quick example, again consider clk_rate = 100 / 3 MHz,
> > > > > > > period_ns = 145340, duty_ns = 72670. Then you end up with
> > > > > > >
> > > > > > > PRESCALE = 17
> > > > > > > MOD = 255
> > > > > > > DUTY = 127
> > > > > >
> > > > > > Incorrect, we will get PRESCALE = 18, MOD = 255, DUTY = 127.
> > > > > >
> > > > > > > That corresponds to period_ns = 137700, duty_ns = 68580. With DUTY = 134
> > > > > > > you get 72360 ns which is still smaller than the requested 72670 ns.
> > > > > >
> > > > > > Incorrect, with DUTY = 134 (PRESCALE = 18 -> period = 145.35 Âs),
> > > > > > duty_ns = 76380ns
> > > > >
> > > > > Yes, as above. When using rounding-closest your error is not in [0, 7.65
> > > > > Âs] but in [-3.825 Âs, 3.825 Âs]. Doesn't make it better.
> > > >
> > > > Actually our use case really dose not care about this error.
> > >
> > > I assume that Thierry will apply your patch anyhow. But be prepared that
> > > you get a patch from me then to improve precision. It would be a waste
> > > of resources not to do that after doing all the necessary math already.
> >
> > Glad to see your improvement without introducing complicated and more
> > mathematics.
>
> I don't understand you. Either you or me will improve the precision. The
> maths is the same for both cases. I would prefer you do it, otherwise I
> will have the problem later that I must get you to invest the time to
> test or I'd have to argue the change to go in untested.
>
> > > > > all lowlevel drivers. If you do this now I won't bother you later when
> > > > > the requirement is implemented in your driver. And the comment helps
> > > > > someone who evaluates your SoC to judge if there is still work to do if
> > > > > they have higher requirements for the PWM.
> > > >
> > > > So what you asked is something like below, right?
> > > > diff --git a/drivers/pwm/pwm-sprd.c b/drivers/pwm/pwm-sprd.c
> > > > index 96f8aa0..1d3db94 100644
> > > > --- a/drivers/pwm/pwm-sprd.c
> > > > +++ b/drivers/pwm/pwm-sprd.c
> > > > @@ -103,12 +103,12 @@ static void sprd_pwm_get_state(struct pwm_chip
> > > > *chip, struct pwm_device *pwm,
> > > > val = sprd_pwm_read(spc, pwm->hwpwm, SPRD_PWM_PRESCALE);
> > > > prescale = val & SPRD_PWM_PRESCALE_MSK;
> > > > tmp = (prescale + 1) * NSEC_PER_SEC * SPRD_PWM_MOD_MAX;
> > > > - state->period = DIV_ROUND_CLOSEST_ULL(tmp, chn->clk_rate);
> > > > + state->period = DIV_ROUND_UP_ULL(tmp, chn->clk_rate);
> > > >
> > > > val = sprd_pwm_read(spc, pwm->hwpwm, SPRD_PWM_DUTY);
> > > > duty = val & SPRD_PWM_DUTY_MSK;
> > > > tmp = (prescale + 1) * NSEC_PER_SEC * duty;
> > > > - state->duty_cycle = DIV_ROUND_CLOSEST_ULL(tmp, chn->clk_rate);
> > > > + state->duty_cycle = DIV_ROUND_UP_ULL(tmp, chn->clk_rate);
> > > >
> > > > /* Disable PWM clocks if the PWM channel is not in enable state. */
> > > > if (!state->enabled)
> > > > @@ -135,8 +135,8 @@ static int sprd_pwm_config(struct sprd_pwm_chip
> > > > *spc, struct pwm_device *pwm,
> > > > duty = duty_ns * SPRD_PWM_MOD_MAX / period_ns;
> > > >
> > > > tmp = (u64)chn->clk_rate * period_ns;
> > > > - do_div(tmp, NSEC_PER_SEC);
> > > > - prescale = DIV_ROUND_CLOSEST_ULL(tmp, SPRD_PWM_MOD_MAX) - 1;
> > > > + div = 1000000000ULL * SPRD_PWM_MOD_MAX;
> > > > + prescale = div64_u64(tmp, div) - 1;
> > > > if (prescale > SPRD_PWM_PRESCALE_MSK)
> > > > prescale = SPRD_PWM_PRESCALE_MSK;
> > >
> > > This goes in the right direction for sure.
> > >
> > > Without taking paper and pencil I wouldn't be surprised if the
> > > calculation of duty_cycle in .get_state didn't match the calculation of
> > > DUTY in .apply yet though.
> > >
> > > > But our MOD is constant, it did not help to improve the precision.
> > > > Instead, like you said, when period_ns = 145340, we will set PRESCALE
> > > > = 17, so in .get_state(), user will get period_ns = 137700 (error
> > > > is145340 - 137700).
> > > >
> > > > But if we use DIV_ROUND_CLOSEST, in .get_state(), user will get
> > > > period_ns = 145350 (error is 145350 - 145340).
> > >
> > > In this case DIV_ROUND_CLOSEST seems to get nearer to the requested
> > > value than when rounding down. But this example was constructed to show
> > > your original algorithm to be bad, and just because you modify your
> > > algorithm to perform better on that constructed example doesn't imply
> > > the new one is better. Moreover you implement a bigger period than
> > > requested which is something I intend to forbid in the future.
> > >
> > > And note that with PWMs there is no "objective" metric that can tell you
> > > which of two implementable outputs better match a given request. It
> > > depends on the use case, so the best we can do is to tell our users our
> > > metric and with that in mind they can create a request that then fits
> > > their needs.
> >
> > Yes, that should be asked by the use case, some cases do not care a
> > little bigger period than requested.
>
> So for some cases it is beneficial to be predictable and for other it
> isn't. So the only safe thing to do for a lowlevel driver is to be
> predictable always because it cannot (and shouldn't) tell if the current
> request is one of cases where precision matters.
>
> > As you said, what you asked did not get a consensus yet, so I'd like
> > to wait for Thierry's suggestion.
> >
> > > > > > > twice instead of once before. (I don't know what architecture your SoC
> > > > > > > uses, but compared to a multiplication a division is usually expensive.)
> > > > > > > Also the math is more complicated now as you have a round-down div and a
> > > > > > > round-closest div.
> > > > > > >
> > > > > > > My preference for how to fix that is to restore the behaviour of v2 that
> > > > > > > matches the comment and adapt .get_state() instead.
> > > > > >
> > > > > > Using DIV_ROUND_CLOSEST_ULL can get a same prescale which matches with
> > > > > > .get_state().
> > > > >
> > > > > I don't get you here. Do you say that with DIV_ROUND_CLOSEST_ULL you get
> > > > > the same result but DIV_ROUND_CLOSEST_ULL matches .get_state while
> > > > > rounding down doesn't? I cannot follow.
> > > >
> > > > Yes, that's what I mean.
> > >
> > > But that is logically broken. If both approaches yield the same
> > > results it cannot be true that exactly one of them matches the inverse
> > > of .get_state.
> >
> > What I mean is use DIV_ROUND_CLOSEST_ULL we can get a nearer value to
> > the requested like above example.
>
> But given that it's unclear if 137700 ns or 145350 ns is better when
> 145340 ns was requested this is not a strong argument to use
> DIV_ROUND_CLOSEST_ULL. With the global picture for the pwm framework in
> mind it is sensible to request the same rounding from all drivers to get
> a consistent behaviour. And I believe the maths with rounding down is
> easier than when rounding up or nearest. That's why I argue in this
> direction.

Let's wait for Thierry's suggestion to get a consensus firstly.

--
Baolin Wang
Best Regards