Re: [PATCH v4 3/3] pwm: Add support for Xilinx AXI Timer

From: Sean Anderson
Date: Tue Jun 29 2021 - 14:01:45 EST

On 6/29/21 4:31 AM, Uwe Kleine-König wrote:
> Hello Sean,
>
> On Mon, Jun 28, 2021 at 01:41:43PM -0400, Sean Anderson wrote:
>> On 6/28/21 1:20 PM, Uwe Kleine-König wrote:
>> > On Mon, Jun 28, 2021 at 12:35:19PM -0400, Sean Anderson wrote:
>> >> On 6/28/21 12:24 PM, Uwe Kleine-König wrote:
>> >> > On Mon, Jun 28, 2021 at 11:50:33AM -0400, Sean Anderson wrote:
>> >> > > On 6/27/21 2:19 PM, Uwe Kleine-König wrote:
>> >> > > > On Fri, Jun 25, 2021 at 01:46:26PM -0400, Sean Anderson wrote:
>> >> > > > > So for the moment, why not give an error? This will be legal code both
>> >> > > > > now and after round_state is implemented.
>> >> > > >
>> >> > > > The problem is where to draw the line. To stay with your example: If a
>> >> > > > request for period = 150 ns comes in, and let X be the biggest period <=
>> >> > > > 150 ns that the hardware can configure. For which values of X should an
>> >> > > > error be returned and for which values the setting should be
>> >> > > > implemented.
>> >> > > >
>> >> > > > In my eyes the only sensible thing to implement here is to tell the
>> >> > > > consumer about X and let it decide if it's good enough. If you have a
>> >> > > > better idea let me hear about it.
>> >> > >
>> >> > > Sure. And I think it's ok to tell the consumer that X is the best we can
>> >> > > do. But if they go along and request an unconfigurable state anyway, we
>> >> > > should tell them as much.
>> >> >
>> >> > I have the impression you didn't understand where I see the problem. If
>> >> > you request 150 ns and the controller can only do 149 ns (or 149.6667 ns)
>> >> > should we refuse? If yes: this would be pretty much unusable, e.g. the led-pwm driver
>> >> > expects that it can configure the duty_cycle in 1/256 steps of the
>> >> > period, and then maybe only steps 27 and 213 of the 256 possible steps
>> >> > work. (This example doesn't really match because the led-pwm driver
>> >> > varies duty_cycle and not period, but the principle becomes clear I
>> >> > assume.) If no: Should we accept 151 ns? Isn't that ridiculous?
>> >>
>> >> I am fine with this sort of rounding. The part I take issue with is when
>> >> the consumer requests (e.g.) a 10ns period, but the best we can do is
>> >> 20ns. Or at the other end if they request a 4s period but the best we
>> >> can do is 2s. Here, there is no obvious way to round it, so I think we
>> >> should just say "come back with a reasonable period" and let whoever
>> >> wrote the device tree pick a better period.
>> >
>> > Note that giving ridiculous examples is easy, but this doesn't help to
>> > actually implement something sensible. Please tell us for your example
>> > where the driver can only implement 20 ns what is the smallest requested
>> > period the driver should accept.
>>
>> 20ns :)
>>
>> In the case of this device, that would result in 0% duty cycle with a
>> 100MHz input. So the smallest reasonable period is 30ns with a duty
>> cycle of 20ns.
>
> I took the time to understand the hardware a bit better, also to be able
> to reply to your formulae below. So to recap (and simplify slightly
> assuming TCSR_UDT = 1):
>
>
>          TLR0 + 2
> period = --------
>          clkrate
>
>               TLR1 + 2
> duty_cycle = --------    if TLR1 < TLR0, else 0
>               clkrate
>
>
> where TLRx has the range [0..0xffffffff] (for some devices the range is
> smaller). So clkrate seems to be 100 MHz?

On my system, yes.
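
Just to double-check my reading of those formulas (and mostly as a note to
myself), here is a rough sketch of the same thing as code, assuming
TCSR_UDT = 1 and ignoring rounding:

        /*
         * Illustrative only: convert the counter values back to times.
         * The real driver also has to handle the up-counting case.
         */
        period_ns = div64_ul(((u64)tlr0 + 2) * NSEC_PER_SEC, clkrate);
        if (tlr1 < tlr0)
                duty_ns = div64_ul(((u64)tlr1 + 2) * NSEC_PER_SEC, clkrate);
        else
                duty_ns = 0;

With clkrate = 100 MHz that makes TLR0 = 1, TLR1 = 0 the smallest useful
setting: a 30 ns period with a 20 ns duty cycle, as mentioned earlier in
the thread.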

>
>> >> > > IMO, this is the best way to prevent surprising results in the API.
>> >> >
>> >> > I think it's not possible in practice to refuse "near" misses and every
>> >> > definition of "near" is in some case ridiculous. Also if you consider
>> >> > the pwm_round_state() case you don't want to refuse any request to tell
>> >> > as much as possible about your controller's capabilities. And then it's
>> >> > straightforward to let apply behave in the same way to keep complexity
>> >> > low.
>> >> >
>> >> > > The real issue here is that it is impossible to determine the correct
>> >> > > way to round the PWM a priori, and in particular, without considering
>> >> > > both duty_cycle and period. If a consumer requests very small
>> >> > > period/duty cycle which we cannot produce, how should it be rounded?
>> >> >
>> >> > Yeah, because there is no obviously right one, I picked one that is as
>> >> > wrong as the other possibilities but is easy to work with.
>> >> >
>> >> > > Should we just set TLR0=1 and TLR1=0 to give them 66% duty cycle with
>> >> > > the least period? Or should we try and increase the period to better
>> >> > > approximate the % duty cycle? And both of these decisions must be made
>> >> > > knowing both parameters. We cannot (for example) just always round up,
>> >> > > since we may produce a configuration with TLR0 == TLR1, which would
>> >> > > produce 0% duty cycle instead of whatever was requested. Rounding rate
>> >> > > will introduce significant complexity into the driver. Most of the time
>> >> > > if a consumer requests an invalid rate, it is due to misconfiguration
>> >> > > which is best solved by fixing the configuration.
>> >> >
>> >> > In the first step pick the biggest period not bigger than the requested
>> >> > and then pick the biggest duty cycle that is not bigger than the
>> >> > requested and that can be set with the just picked period. That is the
>> >> > behaviour that all new drivers should do. This is somewhat arbitrary but
>> >> > after quite some thought the most sensible in my eyes.
>> >>
>> >> And if there are no periods smaller than the requested period?
>> >
>> > Then return -ERANGE.
>>
>> Ok, so instead of
>>
>>         if (cycles < 2 || cycles > priv->max + 2)
>>                 return -ERANGE;
>>
>> you would prefer
>>
>>         if (cycles < 2)
>>                 return -ERANGE;
>>         else if (cycles > priv->max + 2)
>>                 cycles = priv->max;
>
> The actual calculation is a bit harder because it also has to handle
> TCSR_UDT = 0, but in principle yes; see below.
>
>> But if we do the above clamping for TLR0, then we have to recalculate
>> the duty cycle for TLR1. Which I guess means doing something like
>>
>>         ret = xilinx_timer_tlr_period(priv, &tlr0, tcsr0, state->period);
>>         if (ret)
>>                 return ret;
>>
>>         state->duty_cycle = mult_frac(state->duty_cycle,
>>                                       xilinx_timer_get_period(priv, tlr0, tcsr0),
>>                                       state->period);
>>
>>         ret = xilinx_timer_tlr_period(priv, &tlr1, tcsr1, state->duty_cycle);
>>         if (ret)
>>                 return ret;
>
> No, you need something like:
>
>         /*
>          * The multiplication cannot overflow as both priv->max and
>          * NSEC_PER_SEC fit into a u32.
>          */
>         max_period = div64_ul((u64)priv->max * NSEC_PER_SEC, clkrate);
>
>         /* cap period to the maximal possible value */
>         if (state->period > max_period)
>                 period = max_period;
>         else
>                 period = state->period;
>
>         /* cap duty_cycle to the maximal possible value */
>         if (state->duty_cycle > max_period)
>                 duty_cycle = max_period;
>         else
>                 duty_cycle = state->duty_cycle;

These caps may increase the % duty cycle. For example, consider when
max_period is 100 ns and the user requests a period of 150 ns and a duty
cycle of 75 ns, i.e. a relative duty cycle of 50%. The current logic is
equivalent to

        period = min(state->period, max_period);
        duty_cycle = min(state->duty_cycle, max_period);

which will result in a period of 100 ns and a duty cycle of 75 ns, for a
75% relative duty cycle. Instead, we should do

        period = min(state->period, max_period);
        duty_cycle = mult_frac(state->duty_cycle, period, state->period);

which will result in a period of 100 ns and a duty cycle of 50 ns,
preserving the requested 50%.
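
Folded into the rest of your calculation, that would look roughly like the
following (untested sketch, TCSR_UDT = 1 only, variable names as in the
snippets above):

        /* clamp the period first ... */
        period = min(state->period, max_period);
        /* ... then scale duty_cycle so the requested ratio is preserved */
        duty_cycle = mult_frac(state->duty_cycle, period, state->period);

        period_cycles = period * clkrate / NSEC_PER_SEC;
        if (period_cycles < 2)
                return -ERANGE;

        duty_cycles = duty_cycle * clkrate / NSEC_PER_SEC;

(When state->period <= max_period, the mult_frac() is a no-op, so the
unclamped case is unaffected.)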

>         period_cycles = period * clkrate / NSEC_PER_SEC;
>
>         if (period_cycles < 2)
>                 return -ERANGE;
>
>         duty_cycles = duty_cycle * clkrate / NSEC_PER_SEC;
>
>         /*
>          * The hardware cannot emit a 100% relative duty cycle; if
>          * duty_cycles >= period_cycles is programmed, the hardware
>          * emits a 0% relative duty cycle.
>          */
>         if (duty_cycles == period_cycles)
>                 duty_cycles = period_cycles - 1;
>
>         /*
>          * The hardware cannot emit a duty_cycle of one clk step, so
>          * emit 0 instead.
>          */
>         if (duty_cycles < 2)
>                 duty_cycles = period_cycles;

Of course, the above may still turn a requested 100% duty cycle into 0%:
with the minimal period of two cycles, duty_cycles is first reduced to
period_cycles - 1 = 1, and the second adjustment then bumps it back up to
period_cycles, i.e. 0% output. I feel like that is too big of a jump to
ignore. Perhaps if we cannot return -ERANGE, we should at least dev_warn.
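
Something along these lines, as a sketch (assuming the usual chip->dev and
that a nonzero duty_cycle was requested to begin with):

        /*
         * Sketch: if the adjustments above turned a nonzero requested
         * duty_cycle into a 0% output, at least leave a hint in the log.
         */
        if (duty_cycles == period_cycles && state->duty_cycle)
                dev_warn(chip->dev,
                         "duty cycle %llu ns rounded down to 0%%\n",
                         (unsigned long long)state->duty_cycle);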

--Sean

>> >> > > > > Perhaps I should add
>> >> > > > >
>> >> > > > >         if (tlr0 <= tlr1)
>> >> > > > >                 return -EINVAL;
>> >> > > > >
>> >> > > > > here to prevent accidentally getting 0% duty cycle.
>> >> > > >
>> >> > > > You can assume that duty_cycle <= period when .apply is called.
>> >> > >
>> >> > > Ok, I will only check for == then.
>> >> >
>> >> > You just have to pay attention to the case that you had to decrement
>> >> > .period to the next possible value. Then .duty_cycle might be bigger
>> >> > than the corrected period.
>> >>
>> >> This is specifically to prevent 100% duty cycle from turning into 0%. My
>> >> current draft is
>> >>
>> >>         /*
>> >>          * If TLR0 == TLR1, then we will produce 0% duty cycle instead of 100%
>> >>          * duty cycle. Try and reduce the high time to compensate. If we can't
>> >>          * do that because the high time is already 0 cycles, then just error
>> >>          * out.
>> >>          */
>> >>         if (tlr0 == tlr1 && !tlr1--)
>> >>                 return -EINVAL;
>> >
>> > If you follow my suggested policy this isn't an error and you should
>> > yield the biggest duty_cycle here even if it is zero.
>>
>> So like this?
>>
>>         if (tlr0 == tlr1) {
>>                 if (tlr1)
>>                         tlr1--;
>>                 else if (tlr0 != priv->max)
>>                         tlr0++;
>>                 else
>>                         return -ERANGE;
>>         }
>
> No, this is wrong as it configures a longer period than requested in
> some cases.
>
>> And I would really appreciate if you could write up some documentation
>> with common errors and how to handle them. It's not at all obvious to me
>> what all the implications of the above guidelines are.
>
> Yes, I fully agree this should be documented and doing that is on my
> todo list. Until I get around to it, enabling PWM_DEBUG should
> help you get this right (assuming you test extensively and read the
> resulting kernel messages).
>
> Best regards
> Uwe
>