Re: [PATCH v2] string_helpers: fix precision loss for some inputs

From: Rasmus Villemoes
Date: Tue Nov 03 2015 - 18:26:55 EST


On Tue, Nov 03 2015, James Bottomley <James.Bottomley@xxxxxxxxxxxxxxxxxxxxx> wrote:

> On Tue, 2015-11-03 at 23:13 +0100, Rasmus Villemoes wrote:
>> On Tue, Nov 03 2015, James Bottomley <James.Bottomley@xxxxxxxxxxxxxxxxxxxxx> wrote:
>>
>> > From: James Bottomley <JBottomley@xxxxxxxx>
>> >
>> > It was noticed that we lose precision in the final calculation for some
>> > inputs. The most egregious example is size=3000 blk_size=1900 in units of 10,
>> > which should yield 5.70 MB but in fact yields 3.00 MB (oops). This is because the
>> > current algorithm doesn't correctly account for all the remainders in the
>> > logarithms. Fix this by doing a correct calculation in the remainders based
>> > on Napier's algorithm. Additionally, now we have the correct result, we have
>> > to account for arithmetic rounding because we're printing 3 digits of
>> > precision. This means that if the fourth digit is five or greater, we have to
>> > round up, so add a section to ensure correct rounding. Finally account for
>> > all possible inputs correctly, including zero for block size.
>> >
>> > Reported-by: Vitaly Kuznetsov <vkuznets@xxxxxxxxxx>
>> > Cc: stable@xxxxxxxxxxxxxxx # delay backport by two months for testing
>> > Fixes: b9f28d863594c429e1df35a0474d2663ca28b307
>> > Signed-off-by: James Bottomley <JBottomley@xxxxxxxx>
>> >
>> > --
>> >
>> > v2: updated with a recommendation from Rasmus Villemoes to truncate the
>> > initial precision at just under 32 bits
>> >
>> > diff --git a/lib/string_helpers.c b/lib/string_helpers.c
>> > index 5939f63..363faca 100644
>> > --- a/lib/string_helpers.c
>> > +++ b/lib/string_helpers.c
>> > @@ -43,38 +43,40 @@ void string_get_size(u64 size, u64 blk_size, const enum string_size_units units,
>> > [STRING_UNITS_10] = 1000,
>> > [STRING_UNITS_2] = 1024,
>> > };
>> > - int i, j;
>> > - u32 remainder = 0, sf_cap, exp;
>> > + static const unsigned int rounding[] = { 500, 50, 5, 0};
>>
>> j necessarily ends up being 0, 1 or 2. Any reason to include the last entry?
>
> No reason beyond a vague worry someone might try to increase the printed
> precision by one digit.

But that would seem to require prepending 5000 to that array and
changing various constants below to 10000 (aside from checking all
callers to see if they pass a sufficient buffer size), so the 0 doesn't
serve any purpose in that scenario either.
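
To make the point about the table concrete, here is a rough userspace
sketch, not the patch code. It assumes the remainder is held in
thousandths and that j is the number of fractional digits printed (0, 1
or 2); the names rounding3 and round_thousandths are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Half a unit in the last printed place; remainder is in thousandths. */
static const uint32_t rounding3[] = { 500, 50, 5 };

/*
 * Hypothetical helper: add the rounding term for j fractional digits
 * (j is 0, 1 or 2), carry into the whole part if needed, and return
 * the result as a single value in thousandths.
 */
static uint64_t round_thousandths(uint64_t whole, uint32_t rem, unsigned int j)
{
	assert(j < sizeof(rounding3) / sizeof(rounding3[0]));
	assert(rem < 1000);

	rem += rounding3[j];
	if (rem >= 1000) {	/* carry into the integer part */
		rem -= 1000;
		whole += 1;
	}
	return whole * 1000 + rem;
}
```

Under these assumptions every reachable index has a non-zero entry.
Printing a fourth digit would mean a table of { 5000, 500, 50, 5 } and a
scale of 10000, which still leaves no role for a trailing 0.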

>> > +
>> > + while (blk_size >= UINT_MAX)
>> > i++;
>> > - }
>> >
>> > - exp = divisor[units] / (u32)blk_size;
>> > - /*
>> > - * size must be strictly greater than exp here to ensure that remainder
>> > - * is greater than divisor[units] coming out of the if below.
>> > - */
>> > - if (size > exp) {
>> > - remainder = do_div(size, divisor[units]);
>> > - remainder *= blk_size;
>> > + while (size >= UINT_MAX)
>> > i++;
>>
>> Please spell it U32_MAX
>
> Why? There's no reason not to use the arithmetic UINT_MAX here. Either
> works, of course, but UINT_MAX is standard.

We're dealing with explicitly sized integers, and the comment even says
that we're reducing till we fit in 32 bits, so that we can do a
32x32->64 multiplication. U32_MAX is the natural name for the
appropriate constant.
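
As an illustration of why the reduction targets 32 bits, the following
userspace sketch multiplies two values that are already known to fit in
32 bits. UINT32_MAX from <stdint.h> stands in for the kernel's U32_MAX,
and mul_reduced is a hypothetical name, not something from the patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: both operands must already fit in 32 bits.
 * The largest possible product is (2^32 - 1)^2 = 2^64 - 2^33 + 1,
 * which still fits in a u64, so the multiplication cannot overflow.
 */
static uint64_t mul_reduced(uint64_t a, uint64_t b)
{
	assert(a <= UINT32_MAX);
	assert(b <= UINT32_MAX);
	return a * b;
}
```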

Also, you could do > U32_MAX instead of >= U32_MAX, but that's unlikely
to make any difference (well, except it might generate slightly better
code, since it would allow gcc to just test the upper half for being 0,
which might be cheaper on some architectures than comparing to a
literal).
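
A small userspace sketch of the two comparisons, again with UINT32_MAX
standing in for U32_MAX and with hypothetical function names. The two
loop conditions differ only for the single value 0xffffffff, and the
strict form is equivalent to testing whether the upper 32 bits are
non-zero.

```c
#include <assert.h>
#include <stdint.h>

/* Condition as written in the patch, spelled with the 32-bit constant. */
static int reduce_ge(uint64_t x)
{
	return x >= UINT32_MAX;
}

/* Suggested strict form: true exactly when x does not fit in 32 bits. */
static int reduce_gt(uint64_t x)
{
	int by_compare = x > UINT32_MAX;
	int by_shift = (x >> 32) != 0;	/* upper half non-zero */

	assert(by_compare == by_shift);
	return by_compare;
}
```

Whether a compiler actually emits cheaper code for the strict form
depends on the architecture and compiler version; the sketch only shows
that the two tests agree.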

Rasmus