Re: [PATCH v4 3/3] pwm: Add support for Xilinx AXI Timer

From: Uwe Kleine-König
Date: Wed Jun 30 2021 - 04:35:21 EST


Hello Sean,

I often mistype the name of the rounding function as "pwm_round_rate",
the better name is "pwm_round_state" of course. That's just me thinking
about clk_round_rate where ".._rate" is the right term. I'll try harder
to get this right from now on.

On Tue, Jun 29, 2021 at 06:21:15PM -0400, Sean Anderson wrote:
> On 6/29/21 4:51 PM, Uwe Kleine-König wrote:
> > On Tue, Jun 29, 2021 at 02:01:31PM -0400, Sean Anderson wrote:
> > > On 6/29/21 4:31 AM, Uwe Kleine-König wrote:
> > > > On Mon, Jun 28, 2021 at 01:41:43PM -0400, Sean Anderson wrote:
> > > >> On 6/28/21 1:20 PM, Uwe Kleine-König wrote:
> > > >> > On Mon, Jun 28, 2021 at 12:35:19PM -0400, Sean Anderson wrote:
> > > >> >> On 6/28/21 12:24 PM, Uwe Kleine-König wrote:
> > > >> >> > On Mon, Jun 28, 2021 at 11:50:33AM -0400, Sean Anderson wrote:
> > > >> >> > > On 6/27/21 2:19 PM, Uwe Kleine-König wrote:
> > > >> >> > > > On Fri, Jun 25, 2021 at 01:46:26PM -0400, Sean Anderson wrote:
> > > >> >> > > IMO, this is the best way to prevent surprising results in the API.
> > > >> >> >
> > > >> >> > I think it's not possible in practise to refuse "near" misses and every
> > > >> >> > definition of "near" is in some case ridiculous. Also if you consider
> > > >> >> > the pwm_round_state() case you don't want to refuse any request to tell
> > > >> >> > as much as possible about your controller's capabilities. And then it's
> > > >> >> > straight forward to let apply behave in the same way to keep complexity
> > > >> >> > low.
> > > >> >> >
> > > >> >> > > The real issue here is that it is impossible to determine the correct
> > > >> >> > > way to round the PWM a priori, and in particular, without considering
> > > >> >> > > both duty_cycle and period. If a consumer requests very small
> > > >> >> > > period/duty cycle which we cannot produce, how should it be rounded?
> > > >> >> >
> > > >> >> > Yeah, because there is no obviously right one, I picked one that is as
> > > >> >> > wrong as the other possibilities but is easy to work with.
> > > >> >> >
> > > >> >> > > Should we just set TLR0=1 and TLR1=0 to give them 66% duty cycle with
> > > >> >> > > the least period? Or should we try and increase the period to better
> > > >> >> > > approximate the % duty cycle? And both of these decisions must be made
> > > >> >> > > knowing both parameters. We cannot (for example) just always round up,
> > > >> >> > > since we may produce a configuration with TLR0 == TLR1, which would
> > > >> >> > > produce 0% duty cycle instead of whatever was requested. Rounding rate
> > > >> >> > > will introduce significant complexity into the driver. Most of the time
> > > >> >> > > if a consumer requests an invalid rate, it is due to misconfiguration
> > > >> >> > > which is best solved by fixing the configuration.
> > > >> >> >
> > > >> >> > In the first step pick the biggest period not bigger than the requested
> > > >> >> > and then pick the biggest duty cycle that is not bigger than the
> > > >> >> > requested and that can be set with the just picked period. That is the
> > > >> >> > behaviour that all new drivers should do. This is somewhat arbitrary but
> > > >> >> > after quite some thought the most sensible in my eyes.
> > > >> >>
> > > >> >> And if there are no periods smaller than the requested period?
> > > >> >
> > > >> > Then return -ERANGE.
> > > >>
> > > >> Ok, so instead of
> > > >>
> > > >> if (cycles < 2 || cycles > priv->max + 2)
> > > >> return -ERANGE;
> > > >>
> > > >> you would prefer
> > > >>
> > > >> if (cycles < 2)
> > > >> return -ERANGE;
> > > >> else if (cycles > priv->max + 2)
> > > >> cycles = priv->max;
> > > >
> > > > The actual calculation is a bit harder to handle TCSR_UDT = 0 but in
> > > > principle, yes, but see below.
> > > >
> > > >> But if we do the above clamping for TLR0, then we have to recalculate
> > > >> the duty cycle for TLR1. Which I guess means doing something like
> > > >>
> > > >> ret = xilinx_timer_tlr_period(priv, &tlr0, tcsr0, state->period);
> > > >> if (ret)
> > > >> return ret;
> > > >>
> > > >> state->duty_cycle = mult_frac(state->duty_cycle,
> > > >> xilinx_timer_get_period(priv, tlr0, tcsr0),
> > > >> state->period);
> > > >>
> > > >> ret = xilinx_timer_tlr_period(priv, &tlr1, tcsr1, state->duty_cycle);
> > > >> if (ret)
> > > >> return ret;
> > > >
> > > > No, you need something like:
> > > >
> > > > /*
> > > > * The multiplication cannot overflow as both priv_max and
> > > > * NSEC_PER_SEC fit into an u32.
> > > > */
> > > > max_period = div64_ul((u64)priv->max * NSEC_PER_SEC, clkrate);
> > > >
> > > > /* cap period to the maximal possible value */
> > > > if (state->period > max_period)
> > > > period = max_period;
> > > > else
> > > > period = state->period;
> > > >
> > > > /* cap duty_cycle to the maximal possible value */
> > > > if (state->duty_cycle > max_period)
> > > > duty_cycle = max_period;
> > > > else
> > > > duty_cycle = state->duty_cycle;
> > >
> > > These caps may increase the % duty cycle.
> >
> > Correct.
> >
> > For some usecases keeping the relative duty cycle might be better, for
> > others it might not. I'm still convinced that in general my solution
> > makes sense, is computationally cheaper and easier to work with.
>
> Can you please describe one of those use cases? Every PWM user I looked
> (grepping for pwm_apply_state and pwm_config) set the duty cycle as a
> percentage of the period, and not as an absolute time. Keeping the high
> time the same while changing the duty cycle runs contrary to the
> assumptions of all of those users.

Indeed there is no mainline driver that relies on this. There are some
smart LED controllers (e.g. WS2812B) where the duty_cycle is more
important than the period. (I admit a PWM is not really the right driver
for that one as it could only completely enable and complete disable
white color.) Also there are some servo motor chips where the absolute
duty is relevant but the period isn't (in some range). (See
https://www.mikrocontroller.net/articles/Modellbauservo_Ansteuerung#Signalaufbau
for a article about that (in German though).)

In case you want to argue that out-of-mainline users don't count: I
think in the design of an API they do count to place the bar to enter
the mainline low. Frameworks should be generic enough to cover as much
use cases as possible.

And note that if you want a nearest to (say) 50% relative duty cycle and
don't care much about the period it doesn't really matter if you scale
duty_cycle in pwm_round_state() to the period change or not because in
general you need several calls to pwm_round_state() anyhow to find a
setting with 51% if the next lower possibility is 47%. So in the end you
save (I think) one call in generic PWM code.

In contrast the math gets quite a bit more complicated because there is
rounding involved in scaling the duty cycle. Consider a PWM that can
configure period and duty in 16.4 ns steps and you ask for

.period = 100 ns
.duty_cycle = 50 ns

Then the best period you can provide is 98.4 ns, so you return .period =
99 from pwm_round_state(). (Yes, you don't return 98, because
round-nearest is much harder to handle than round down.) To determine
the adapted duty_cycle you have to do

50 * realperiod / 100

which independently of choosing 98, 98.4 or 99 for realperiod is 49. Then
to approximate 49 without rounding up you end up with 32.8 while 49.2
would have be perfectly fine.

You might find a way around that (maybe you have to round up in the
adaption of duty_cycle, I didn't convince myself this is good enough
though).

So your suggestion to adapt the duty_cycle to keep the relative
duty_cycle constant (as good as possible within the bounds the hardware
dictates) implies additional complication at the driver level.

From a framework maintainer's point of view (and also from a low-level
driver maintainer's point of view) I prefer one complication in a
generic function over a complication that I have to care for in each and
every low-level driver by a big margin.

So unless you volunteer to complete the math above and promise to review
low-level drivers for that aspect in the future (or alternatively
convince me that math is easy and I missed something) I would like to
end this discussion here and stay with the policy I explained.

> > > > period_cycles = period * clkrate / NSEC_PER_SEC;
> > > >
> > > > if (period_cycles < 2)
> > > > return -ERANGE;
> > > >
> > > > duty_cycles = duty_cycle * clkrate / NSEC_PER_SEC;
> > > >
> > > > /*
> > > > * The hardware cannot emit a 100% relative duty cycle, if
> > > > * duty_cycle >= period_cycles is programmed the hardware emits
> > > > * a 0% relative duty cycle.
> > > > */
> > > > if (duty_cycle == period_cycles)
> > > > duty_cycles = period_cycles - 1;
> > > >
> > > > /*
> > > > * The hardware cannot emit a duty_cycle of one clk step, so
> > > > * emit 0 instead.
> > > > */
> > > > if (duty_cycles < 2)
> > > > duty_cycles = period_cycles;
> > >
> > > Of course, the above may result in 100% duty cycle being rounded down to
> > > 0%. I feel like that is too big of a jump to ignore. Perhaps if we
> > > cannot return -ERANGE we should at least dev_warn.
> >
> > You did it again. You picked one single case that you consider bad but
> > didn't provide a constructive way to make it better.
>
> Sure I did. I suggested that we warn. Something like
>
> if (duty_cycles == period_cycles)
> if (--duty_cycles < 2)
> dev_warn(chip->dev, "Rounding 100%% duty cycle down to 0%%; pick a longer period\n");
>
> or
>
> if (period_cycles < 2)
> return -ERANGE;
> else if (period_cycles < 10)
> dev_notice(chip->dev,
> "very short period of %u cycles; duty cycle may be rounded to 0%%\n",
> period_cycles);

Ah ok, so only a 100% jump warrants that warning. I think adding that
has no practical relevance, so I don't oppose to that. Add it if you
want. (But note that if it triggers indeed it might flood the kernel log
if your consumer wants to start a motor but notices it doesn't run fast
enough and so configures 100% in a tight loop. So I would recommend some
rate limiting.)

> Because 90% of the time, if a user requests such a short period it is
> due to a typo or something similar. And if they really are doing it
> intentionally, then they should just set duty_cycle=0.

Uh, don't you think that a warning that is wrong in 10% of the cases is
bad?

> > Assume there was already a pwm_round_state function (that returns the
> > state that pwm_apply_state would implement for a given request) Consider
> > a consumer that wants say a 50% relative duty together with a small
> > period. So it first might call:
> >
> > ret = pwm_round_rate(pwm, { .period = 20, .duty_cycle = 20, ... }, &rounded_state)
> >
> > to find out if .period = 20 can be implemented with the given PWM. If
> > this returns rounded state as:
> >
> > .period = 20
> > .duty_cycle = 0
> >
> > this says quite a lot about the pwm if the driver implements my policy.
> > (i.e.: The driver can do 20ns, but the biggest duty_cycle is only 0).
> > If however it returns -ERANGE this means (assuming the driver implements
> > the policy I try to convice you to be the right one) it means: The
> > hardware cannot implement 20 ns (or something smaller) and so the next
> > call probably tries 40 ns.
> >
> > With your suggested semantic -ERANGE might mean:
> >
> > - The driver doesn't support .period = 20 ns
> > (Follow up questions: What period should be tried next? 10 ns? 40
> > ns? What if this returns -ERANGE again?)
> > - The driver supports .period = 20 ns, but the biggest possible
> > duty_cycle is "too different from 20 ns to ignore".
> >
> > Then how should the search continue?
>
> round_rate does not have to use the same logic as apply_state.

I want to have .round_state() and .apply() (i.e. the driver callbacks)
to behave identically. If we indeed come to the conclusion that
pwm_apply_state needs to have some precautions, I'd like to have them
implemented in pwm_apply_state() only and not in every driver.

> However, calling
>
> ret = pwm_apply_state(pwm, { .period = 20, .duty_cycle = 0, ... })
>
> should work just fine, as the caller clearly knows what they are getting
> into. IMO this is the best way to allow hypothetical round_rate users to
> find out the edges of the PWM while still protecting existing users.
>
> It's perfectly fine to round
>
> { .period = 150, .duty_cycle = 75 }
>
> to
>
> { .period = 100, .duty_cycle = 75 }
>
> in round_rate. But doing the same thing for apply_state would be very
> surprising to every existing PWM user.
>
> IMO the following invariant should hold
>
> apply_state(round_rate(x))
> assert(round_rate(x) == get_state())

(merged your correction of the follow up mail into the quote above)

(Fun fact: Only needing this one would allow a generic implementation of
round_state, it just had to return a pwm_state that doesn't depend on x
:o)

> but the following should not necessarily hold
>
> apply_state(x)
> assert(round_rate(x) == get_state())
>
> Of course, where it is reasonable to round down, we should do so.
>
> But where the result may be surprising, then the caller should specify
> the rounded state specifically. It is better to fail loudly and
> noisily than
> to silently accept garbage.

Can you please come up with an algorithm to judge if a given deviation
is reasonable or surprising? I agree there are surprises and some of
them are obviously bad. For most cases however the judgement depends on
the use case so I fail to see how someone should program such a check
that should cover all consumers and use cases. I prefer no precautions +
an easy relation between pwm_round_state and pwm_apply_state (i.e.
behave identically) over a most of the time(?) useless precaution and
some policy defined differences between pwm_round_state and
pwm_apply_state

Best regards
Uwe

--
Pengutronix e.K. | Uwe Kleine-König |
Industrial Linux Solutions | https://www.pengutronix.de/ |

Attachment: signature.asc
Description: PGP signature