Re: [PATCH v4] media: rc: pwm-ir-tx: Switch to atomic PWM API

From: Uwe Kleine-König
Date: Thu Oct 28 2021 - 14:05:34 EST


On Thu, Oct 28, 2021 at 01:26:10PM +0100, Sean Young wrote:
> Hi Uwe,
>
> On Thu, Oct 28, 2021 at 01:15:35PM +0200, Uwe Kleine-König wrote:
> > On Thu, Oct 28, 2021 at 10:14:42AM +0100, Sean Young wrote:
> > > On Thu, Oct 28, 2021 at 08:45:13AM +0200, Uwe Kleine-König wrote:
> > > > The conversion is right (I think),
> > >
> > > We still have the problem that the pwm drivers calculate the period
> > > incorrectly by rounding down (except pwm-bcm2835). So the period is not
> > > as good as it could be in most cases, but this driver can't do anything
> > > about that.
> >
> > Yeah, some time ago I started coding a round_state function
> > (wip at
> > https://git.pengutronix.de/cgit/ukl/linux/commit/?h=pwm-wip&id=ae348eb6a55d6526f30ef4a49819197d9616391e)
> > but this was pushed down on my todo-list by more important stuff.
>
> That looks great, thank you for working on that!
>
> > If you want to experiment with that ...
>
> I will have a look.
>
> > > > note this could be optimized a bit
> > > > further: state.period only depends on carrier which rarely changes, so
> > > > the calculation could be done in pwm_ir_set_carrier(). Ditto for duty
> > > > which only depends on state.period and pwm_ir->duty_cycle. (This is for
> > > > a separate commit though.)
> > >
> > > I'm not sure what caching this is much of a win. The calculation is a few
> > > instructions, so you're not winning in the way of speed. On the flip side
> > > you use more memory since pwm_state has to be kmalloc() rather than existing
> >
> > I tested a bit with this patch on top of Maíra's:
> >
> > diff --git a/drivers/media/rc/pwm-ir-tx.c b/drivers/media/rc/pwm-ir-tx.c
> > index 105a9c24f1e3..7585c21775bc 100644
> > --- a/drivers/media/rc/pwm-ir-tx.c
> > +++ b/drivers/media/rc/pwm-ir-tx.c
> > @@ -17,7 +17,7 @@
> >
> > struct pwm_ir {
> > struct pwm_device *pwm;
> > - unsigned int carrier;
> > + struct pwm_state state;
> > unsigned int duty_cycle;
> > };
> >
> > @@ -32,6 +32,7 @@ static int pwm_ir_set_duty_cycle(struct rc_dev *dev, u32 duty_cycle)
> > struct pwm_ir *pwm_ir = dev->priv;
> >
> > pwm_ir->duty_cycle = duty_cycle;
> > + pwm_set_relative_duty_cycle(&pwm_ir->state, pwm_ir->duty_cycle, 100);
> >
> > return 0;
> > }
> > @@ -43,7 +44,8 @@ static int pwm_ir_set_carrier(struct rc_dev *dev, u32 carrier)
> > if (!carrier)
> > return -EINVAL;
> >
> > - pwm_ir->carrier = carrier;
> > + pwm_ir->state.period = DIV_ROUND_CLOSEST(NSEC_PER_SEC, carrier);
> > + pwm_set_relative_duty_cycle(&pwm_ir->state, pwm_ir->duty_cycle, 100);
> >
> > return 0;
> > }
> > @@ -53,21 +55,15 @@ static int pwm_ir_tx(struct rc_dev *dev, unsigned int *txbuf,
> > {
> > struct pwm_ir *pwm_ir = dev->priv;
> > struct pwm_device *pwm = pwm_ir->pwm;
> > - struct pwm_state state;
> > int i;
> > ktime_t edge;
> > long delta;
> >
> > - pwm_init_state(pwm, &state);
> > -
> > - state.period = DIV_ROUND_CLOSEST(NSEC_PER_SEC, pwm_ir->carrier);
> > - pwm_set_relative_duty_cycle(&state, pwm_ir->duty_cycle, 100);
> > -
> > edge = ktime_get();
> >
> > for (i = 0; i < count; i++) {
> > - state.enabled = !(i % 2);
> > - pwm_apply_state(pwm, &state);
> > + pwm_ir->state.enabled = !(i % 2);
> > + pwm_apply_state(pwm, &pwm_ir->state);
> >
> > edge = ktime_add_us(edge, txbuf[i]);
> > delta = ktime_us_delta(edge, ktime_get());
> > @@ -75,8 +71,8 @@ static int pwm_ir_tx(struct rc_dev *dev, unsigned int *txbuf,
> > usleep_range(delta, delta + 10);
> > }
> >
> > - state.enabled = false;
> > - pwm_apply_state(pwm, &state);
> > + pwm_ir->state.enabled = false;
> > + pwm_apply_state(pwm, &pwm_ir->state);
> >
> > return count;
> > }
> > @@ -95,8 +91,9 @@ static int pwm_ir_probe(struct platform_device *pdev)
> > if (IS_ERR(pwm_ir->pwm))
> > return PTR_ERR(pwm_ir->pwm);
> >
> > - pwm_ir->carrier = 38000;
> > - pwm_ir->duty_cycle = 50;
> > + pwm_ir->state.duty_cycle = 50;
> > + pwm_init_state(pwm_ir->pwm, &pwm_ir->state);
> > + pwm_ir_set_carrier(rcdev, 38000);
> >
> > rcdev = devm_rc_allocate_device(&pdev->dev, RC_DRIVER_IR_RAW_TX);
> > if (!rcdev)
> >
> > bloat-o-meter reports (for an arm allmodconfig build)
> >
> > add/remove: 0/0 grow/shrink: 3/1 up/down: 644/-396 (248)
> > Function old new delta
> > pwm_ir_probe 372 676 +304
> > pwm_ir_set_carrier 108 292 +184
> > pwm_ir_set_duty_cycle 68 224 +156
> > pwm_ir_tx 908 512 -396
> > Total: Before=2302, After=2550, chg +10.77%
>
> So 248 bytes more after your changes.

ack. This is because the compiler inlines the division which accounts
for > 100 bytes.

> > struct pwm_ir increases from 12 bytes to 40 bytes.
> >
> > The stack space required by pwm_ir_tx decreases from 60 to 36
> >
> > I don't know exactly how kmalloc works internally. Maybe allocating a
> > structure of size 40 bytes doesn't need more memory than a structure of
> > size 12?
> >
> > I didn't check how runtimes change, but the size decrease of pwm_ir_tx()
> > is nice and might save a bit of runtime.
>
> I'm not following, how is this decreasing runtime?

With my changes pwm_ir_tx got smaller and { pwm_ir_probe,
pwm_ir_set_carrier, pwm_ir_set_duty_cycle } got bigger. Now if for a
typical runtime pattern pwm_ir_probe and pwm_ir_set_carrier run once and
pwm_ir_set_duty_cycle 100 times and pwm_ir_tx 1000 times (no idea if
that is realistic) it might be a net win in sum.

Best regards
Uwe

--
Pengutronix e.K. | Uwe Kleine-König |
Industrial Linux Solutions | https://www.pengutronix.de/ |

Attachment: signature.asc
Description: PGP signature