Re: fix int_sqrt() for very large numbers

From: Crt Mori
Date: Sun Jan 20 2019 - 03:32:30 EST


On Sun, 20 Jan 2019 at 04:49, Linus Torvalds
<torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>
> On Sun, Jan 20, 2019 at 12:01 PM Will Deacon <will.deacon@xxxxxxx> wrote:
> >
> > > @@ -52,7 +52,7 @@ u32 int_sqrt64(u64 x)
> > > if (x <= ULONG_MAX)
> > > return int_sqrt((unsigned long) x);
> > >
> > > - m = 1ULL << (fls64(x) & ~1ULL);
> > > + m = 1ULL << ((fls64(x) - 1) & ~1ULL);
> >
> > This just looks like a copy-paste error because there isn't an __fls64().
> > But I think your suggestion here is ok, given the previous check against
> > ULONG_MAX.
>
> Hmm. We probably *should* add a __fls64().
>
> There looks to be only one user of int_sqrt64(), and that one is
> confused. It does int_sqrt64() twice, but since the inner one will
> reduce the range to 32 bits, the outer one is just silly.

I have a use case (mlx90632) where this calculation worked on arm64
(Nexus), but not on regular 32-bit ARM (BeagleBone). I tried going
with a full u64-to-u64 version, but I was persuaded that it is not
necessary, and testing on a black body (sensor range 0 - 80 degrees)
confirmed that u32/u64 is enough for my calculations. Because of the
testing range (and keep in mind the result is cast to signed after two
sqrts), the high bit might never affect my end result, but I needed
precision, not range. Inside the function, b was 32 bits wide on a
32-bit core, but I needed it to be 64 bits. To keep it similar to the
existing int_sqrt, I decided to just type all variables there as 64-bit.
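
A minimal userspace sketch of the point above, not the code from
lib/int_sqrt.c: every intermediate (b, m, y) is 64 bits wide, and the
starting bit uses the (fls64(x) - 1) form from the quoted hunk. The
names fls64_sketch() and int_sqrt64_sketch() are hypothetical stand-ins,
and the ULONG_MAX fast path is left out. With bit 63 set, a 1-based
fls64() returns 64, so the unadjusted shift count would be 64.

```c
#include <stdint.h>

/* Hypothetical stand-in for the kernel's fls64(): 1-based index of the
 * highest set bit, 0 when x == 0. */
static int fls64_sketch(uint64_t x)
{
	int n = 0;

	while (x) {
		n++;
		x >>= 1;
	}
	return n;
}

/* Digit-by-digit integer square root with all intermediates typed as
 * 64-bit, so the result does not depend on the width of unsigned long. */
static uint32_t int_sqrt64_sketch(uint64_t x)
{
	uint64_t b, m, y = 0;

	if (x <= 1)
		return (uint32_t)x;

	/* Highest even bit position at or below the top set bit (0-based). */
	m = 1ULL << ((fls64_sketch(x) - 1) & ~1);
	while (m != 0) {
		b = y + m;
		y >>= 1;

		if (x >= b) {
			x -= b;
			y += m;
		}
		m >>= 2;
	}

	return (uint32_t)y;
}
```

For example, int_sqrt64_sketch(UINT64_MAX) starts from m = 1ULL << 62
and returns 4294967295.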

We have an implementation of this with doubles (see datasheet), and I
ported it to integer on arm64. The end results of the two calculations
were fairly similar (within the object temperature range of 0 - 80).

> That one user also had better not be overflowing into the high bit -
> it uses "s64" as a type and does seem to use signed operations, so high
> bit set really means negative. sqrt() returning something odd for a
> negative number wouldn't be all that odd in that context.
>
> But yes, our current int_sqrt64() does seem buggy as-is, because it's
> *supposed* to work on u64's, even if I don't think we really have any
> users that care.

I introduced strong types for the existing int_sqrt implementation to
keep it aligned between 64-bit and 32-bit.

Best regards,
Crt

> And as Will mentioned, the regular int_sqrt() looks perfectly fine,
> and subtracting 1 from the __fls() return value would actually
> _introduce_ a bug.
>
> Linus
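
To illustrate the last quoted point, here is a small userspace sketch,
assuming a 0-based __fls() as in the kernel. The helpers fls0_sketch()
and isqrt_from_bit() are hypothetical names, and uint64_t stands in for
unsigned long. Starting from the 0-based index gives the right root,
while subtracting 1 from it starts the loop too low.

```c
#include <stdint.h>

/* Hypothetical stand-in for __fls(): 0-based index of the highest set
 * bit. Like the kernel helper, it is not meaningful for x == 0. */
static int fls0_sketch(uint64_t x)
{
	int n = -1;

	while (x) {
		n++;
		x >>= 1;
	}
	return n;
}

/* Same digit-by-digit loop as int_sqrt(), but the starting bit is a
 * parameter so the two choices can be compared. bit must be >= 0. */
static uint64_t isqrt_from_bit(uint64_t x, int bit)
{
	uint64_t b, y = 0;
	uint64_t m = 1ULL << (bit & ~1);

	while (m != 0) {
		b = y + m;
		y >>= 1;

		if (x >= b) {
			x -= b;
			y += m;
		}
		m >>= 2;
	}
	return y;
}
```

With x = 16, isqrt_from_bit(16, fls0_sketch(16)) returns 4, while
isqrt_from_bit(16, fls0_sketch(16) - 1) returns 3.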