Re: [PATCH v7 08/13] iio: afe: rescale: fix precision on fractional log scale

From: Peter Rosin
Date: Mon Aug 02 2021 - 05:17:10 EST


On 2021-08-01 21:39, Liam Beguin wrote:
> From: Liam Beguin <lvb@xxxxxxxxxx>
>
> The IIO_VAL_FRACTIONAL_LOG2 scale type doesn't return the expected
> scale. Update the case so that the rescaler returns a fractional type
> and a more precise scale.
>
> Signed-off-by: Liam Beguin <lvb@xxxxxxxxxx>
> ---
> drivers/iio/afe/iio-rescale.c | 15 ++++++++++-----
> 1 file changed, 10 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/iio/afe/iio-rescale.c b/drivers/iio/afe/iio-rescale.c
> index abd7ad73d1ce..e37a9766080c 100644
> --- a/drivers/iio/afe/iio-rescale.c
> +++ b/drivers/iio/afe/iio-rescale.c
> @@ -47,12 +47,17 @@ int rescale_process_scale(struct rescale *rescale, int scale_type,
> *val2 = rescale->denominator;
> return IIO_VAL_FRACTIONAL;
> case IIO_VAL_FRACTIONAL_LOG2:
> - tmp = *val * 1000000000LL;
> - do_div(tmp, rescale->denominator);
> - tmp *= rescale->numerator;
> - do_div(tmp, 1000000000LL);
> + if (check_mul_overflow(*val, rescale->numerator, (s32 *)&tmp) ||
> + check_mul_overflow(rescale->denominator, (1 << *val2), (s32 *)&tmp2)) {
> + tmp = (s64)*val * rescale->numerator;
> + tmp2 = (s64)rescale->denominator * (1 << *val2);
> + factor = gcd(abs(tmp), abs(tmp2));
> + tmp = div_s64(tmp, factor);
> + tmp2 = div_s64(tmp2, factor);

The case I really worry about is when trying to get an exact result by using
gcd() really doesn't improve the situation, and the only way to avoid overflow
is to reduce the precision. A perhaps contrived example:

    scale numerator      1,220,703,125   i.e. 5 ^ 13
    scale denominator    1,162,261,467   i.e. 3 ^ 19
    *val                 1,129,900,996   i.e. 7 ^ 10 * 2 ^ 2
    *val2                            2   i.e. value = 7 ^ 10

Then you get overflow for both the calls to check_mul_overflow(). But when gcd()
returns 1 (or something too small) the overflow is "returned" as-is.

With the old code you get something that is at least not completely wrong, just
not as accurate as is perhaps possible:
    *val     1,186,715,480
    *val2                2
Or 1,186,715,480 / 2^2 = 296,678,870.

With this patch the above makes you attempt to return the fraction:
    *val     1,379,273,676,757,812,500
    *val2                4,649,045,868
Or 296,678,870.443403528 (or something like that, not 100% sure about all the
fractional digits, but they are not really important for my argument)

While the latter is more correct, truncation to 32-bit clobbers the result so
in reality this is returned:
    *val     -281,918,188
    *val2     354,078,572
Or -0.796202341
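
The arithmetic above can be replayed in userspace. What follows is only a
minimal sketch, not kernel code: wrap_s32(), gcd64(), old_log2_val() and
patched_products() are hypothetical helper names, ordinary 64-bit division
stands in for do_div()/div_s64(), and the wide products are always computed
instead of being guarded by check_mul_overflow().

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Wrap a 64-bit intermediate to 32 bits, as the plain "*val = tmp" store
 * does. The final unsigned-to-signed conversion is implementation-defined
 * in C; GCC and Clang wrap in two's complement. */
int32_t wrap_s32(int64_t v)
{
	return (int32_t)(uint32_t)v;
}

/* Euclid's algorithm, standing in for the kernel's gcd(). */
int64_t gcd64(int64_t a, int64_t b)
{
	while (b != 0) {
		int64_t t = a % b;

		a = b;
		b = t;
	}
	return a;
}

/* Pre-patch calculation: divide first, multiply second, round down twice. */
int32_t old_log2_val(int32_t val, int32_t num, int32_t den)
{
	int64_t tmp;

	assert(den > 0);
	tmp = (int64_t)val * 1000000000LL;
	tmp /= den;
	tmp *= num;
	tmp /= 1000000000LL;
	return (int32_t)tmp;
}

/* Wide products of the patched branch, optionally reduced by their gcd. */
void patched_products(int32_t val, int32_t val2, int32_t num, int32_t den,
		      int reduce, int64_t *tmp, int64_t *tmp2)
{
	assert(val2 >= 0 && val2 < 31);
	*tmp = (int64_t)val * num;
	*tmp2 = (int64_t)den * (1LL << val2);
	if (reduce) {
		int64_t factor = gcd64(llabs(*tmp), llabs(*tmp2));

		assert(factor != 0);
		*tmp /= factor;
		*tmp2 /= factor;
	}
}
```

With the inputs from the example, old_log2_val() gives 1,186,715,480, and
wrapping the unreduced products gives -281,918,188 and 354,078,572. For these
particular inputs gcd64() is 4, so the reduced denominator is back to 3 ^ 19
and fits, but the reduced numerator (7 ^ 10 * 5 ^ 13) is still far too big for
32 bits, so the conclusion does not change.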

So, while it might seem unlucky that gcd() will not find a big enough factor,
it is certainly possible. And I also worry that when this happens it will only
happen once in a while, and that the resulting bad values might be extremely
unexpected and difficult to track down. Things that happen once in a blue moon
are simply not fun to debug.

I.e. I worry that small islands of input will cause failures. With the old code
there are no such islands. The scale factor alone determines the precision, and
if you get poor precision you get poor precision throughout the range. And any
problem will therefore be "stable" and much easier to debug for "innocent" 3rd
party users that may not even be aware that the rescaler is involved at all.

This is also an issue I have with patch 7/13, but there the only thing that is
sacrificed is CPU cycles. Nonetheless, I doubt that patch 7/13 is wise,
precisely because it might cause issues that are intermittent and therefore
difficult to debug.

Also, changing the calculation so that you get more precision whenever that is
possible feels dangerous. I fear that linearity breaks and that a bigger input
causes a smaller output due to rounding, if the bigger value has to be rounded
down and this isn't done carefully enough. I.e. attempting to return an exact
fraction and only falling back to the old code when that is not possible is
still not safe, since the old code isn't careful enough about rounding. I think
it is really important that a bigger input causes a bigger (or equal) output.
Otherwise you might trigger instability in feedback loops should a rescaler be
involved in some regulator function.
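
To make the ordering concern concrete, here is a minimal userspace sketch of a
hypothetical hybrid that returns the exact fraction while val * numerator fits
in 32 bits and otherwise falls back to the truncating calculation. This is not
what the patch does; struct frac, truncating_scale(), hybrid_scale() and
frac_less() are made-up names, val2 is taken to be 0, and the scale 3/1000 is
picked purely for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* A result n / d; d is always positive in this sketch. */
struct frac {
	int64_t n;
	int64_t d;
};

/* Old-style calculation: rounds down twice and returns an integer. */
int32_t truncating_scale(int32_t val, int32_t num, int32_t den)
{
	int64_t tmp;

	assert(den > 0);
	tmp = (int64_t)val * 1000000000LL;
	tmp /= den;
	tmp *= num;
	tmp /= 1000000000LL;
	return (int32_t)tmp;
}

/* Hypothetical hybrid: exact fraction when the numerator fits in 32 bits,
 * truncating fallback otherwise. */
struct frac hybrid_scale(int32_t val, int32_t num, int32_t den)
{
	int64_t wide = (int64_t)val * num;
	struct frac r;

	assert(den > 0);
	if (wide >= INT32_MIN && wide <= INT32_MAX) {
		r.n = wide;
		r.d = den;
	} else {
		r.n = truncating_scale(val, num, den);
		r.d = 1;
	}
	return r;
}

/* a < b by cross multiplication; operands here are at most 32 bits wide,
 * so the 64-bit products cannot overflow. */
int frac_less(struct frac a, struct frac b)
{
	return a.n * b.d < b.n * a.d;
}
```

With a scale of 3/1000, an input of 715,827,882 still fits and yields
2,147,483,646 / 1000, while 715,827,883 takes the fallback and yields
2,147,483. The bigger input produces the smaller output, right at the boundary
where the calculation switches.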

Cheers,
Peter

> + }
> *val = tmp;
> - return scale_type;
> + *val2 = tmp2;
> + return IIO_VAL_FRACTIONAL;
> case IIO_VAL_INT_PLUS_NANO:
> case IIO_VAL_INT_PLUS_MICRO:
> if (scale_type == IIO_VAL_INT_PLUS_NANO)
>