Re: [PATCH v3 1/4] lib/string_helpers: change blk_size to u32 for string_get_size() interface

From: Vitaly Kuznetsov
Date: Mon Nov 02 2015 - 10:58:23 EST


James Bottomley <jbottomley@xxxxxxxx> writes:

> On Fri, 2015-10-30 at 11:46 +0100, Vitaly Kuznetsov wrote:
>> James Bottomley <jbottomley@xxxxxxxx> writes:
>>
>> > On Thu, 2015-10-29 at 17:30 +0100, Vitaly Kuznetsov wrote:
>> >> string_get_size() can't really handle huge block sizes, especially
>> >> blk_size > U32_MAX but string_get_size() interface states the opposite.
>> >> Change blk_size from u64 to u32 to reflect the reality.
>> >
>> > What is the actual evidence for this? The calculation is designed to be
>> > a symmetric 128 bit multiply. When I wrote and tested it, it worked
>> > fine for huge block sizes.
>>
>> We have 'u32 remainder' and then we do:
>>
>> exp = divisor[units] / (u32)blk_size;
>> ...
>> remainder = do_div(size, divisor[units]);
>> remainder *= blk_size;
>>
>> I'm pretty sure it will overflow for some inputs.
>
> It shouldn't; the full code snippet does this:
>
> while (blk_size >= divisor[units]) {
> remainder = do_div(blk_size, divisor[units]);
> i++;
> }
>
> exp = divisor[units] / (u32)blk_size;
>
> So by the time it reaches the statement you complain about, blk_size is
> already less than or equal to the divisor (which is 1000 or 1024) so
> truncating to 32 bits is always correct.
>

I overlooked, sorry!

> I'm sort of getting the impression you don't quite understand the
> mathematics: i is the logarithm to the base divisor[units]. We reduce
> both operands to exponents of the logarithm base (adding the two bases
> together in i), which means they are by definition in a range between
> zero and the base and then multiply the remaining exponents correcting
> the result for a base overflow (so the result is always a correct
> exponent and i is the logarithm to the base). It's actually simply
> Napier's algorithm.
>
> The reason we're getting the up to 2.5% rounding errors you complain
> about is because at each logarithm until the last one, we throw away the
> remainder (it's legitimate because it's always 1000x smaller than the
> exponent), but in the case of a large remainder it provides a small
> correction to the final operation which we don't account for. If you
> want to make a true correction, you save the penultimate residue in each
> case, multiply each by the *other* exponent add them together, divide by
> the base and increment the final result by the remainder.

My assumption was that we don't really need to support blk_sizes >
U32_MAX and we can simplify string_get_size() instead of adding
additional complexity. Apparently, the assumption was wrong.

>
> However, for 2.5% the physicist in me says the above is way overkill.
>

It is getting was over 2.5% if blk_size is not a power of 2. While it is
probably never the case for block subsystem the function is in lib and
pretends to be general-enough. I'll try to make proper correction and
let's see if it's worth the effort.

Thanks,

> James
>
>> >
>> > James
>> >
>> >> Signed-off-by: Vitaly Kuznetsov <vkuznets@xxxxxxxxxx>
>> >> ---
>> >> include/linux/string_helpers.h | 2 +-
>> >> lib/string_helpers.c | 4 ++--
>> >> 2 files changed, 3 insertions(+), 3 deletions(-)
>> >>
>> >> diff --git a/include/linux/string_helpers.h b/include/linux/string_helpers.h
>> >> index dabe643..1223e80 100644
>> >> --- a/include/linux/string_helpers.h
>> >> +++ b/include/linux/string_helpers.h
>> >> @@ -10,7 +10,7 @@ enum string_size_units {
>> >> STRING_UNITS_2, /* use binary powers of 2^10 */
>> >> };
>> >>
>> >> -void string_get_size(u64 size, u64 blk_size, enum string_size_units units,
>> >> +void string_get_size(u64 size, u32 blk_size, enum string_size_units units,
>> >> char *buf, int len);
>> >>
>> >> #define UNESCAPE_SPACE 0x01
>> >> diff --git a/lib/string_helpers.c b/lib/string_helpers.c
>> >> index 5939f63..f6c27dc 100644
>> >> --- a/lib/string_helpers.c
>> >> +++ b/lib/string_helpers.c
>> >> @@ -26,7 +26,7 @@
>> >> * at least 9 bytes and will always be zero terminated.
>> >> *
>> >> */
>> >> -void string_get_size(u64 size, u64 blk_size, const enum string_size_units units,
>> >> +void string_get_size(u64 size, u32 blk_size, const enum string_size_units units,
>> >> char *buf, int len)
>> >> {
>> >> static const char *const units_10[] = {
>> >> @@ -58,7 +58,7 @@ void string_get_size(u64 size, u64 blk_size, const enum string_size_units units,
>> >> i++;
>> >> }
>> >>
>> >> - exp = divisor[units] / (u32)blk_size;
>> >> + exp = divisor[units] / blk_size;
>> >> /*
>> >> * size must be strictly greater than exp here to ensure that remainder
>> >> * is greater than divisor[units] coming out of the if below.
>>

--
Vitaly
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/