RE: [PATCH v10 1/3] lib: Add strongly typed 64bit int_sqrt

From: David Laight
Date: Wed Dec 20 2017 - 11:46:22 EST


From: Crt Mori
> Sent: 20 December 2017 16:17
>
> On 20 December 2017 at 17:00, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> > On Wed, Dec 20, 2017 at 02:39:26PM +0000, David Laight wrote:
> >
> >> With minor changes it ought to be possible to remove most of the
> >> 64bit arithmetic and shifts.
> >>
> >> If you care about performance then using 32 bit maths will be much faster.
> >
> > Some, u64 add/sub/shift isn't exactly expensive, but yes, I also
> > indicated that improvement is possible. At the very least y can be made
> > a u32 I suppose.
>
> OK, is there any more easy optimizations you see?

I think this version works.
It doesn't have the optimisation for small values.

unsigned int sqrt64(unsigned long long x)
{
unsigned int x_hi = x >> 32;

unsigned int b = 0;
unsigned int y = 0;
unsigned int i;

for (i = 0; i < 32; i++) {
b <<= 2;
b |= x_hi >> 30;
x_hi <<= 2;
if (i == 15)
x_hi = x;
y <<= 1;
if (b > y)
b -= ++y;
}
return y;
}

Put it through cc -O3 -m32 -c -o sqrt64.o sqrt64.c and then objdump sqrt64.o
and compare to that of your version.

David