Re: [PATCH] lib/int_sqrt.c: Optimize square root function
From: Linus Torvalds
Date: Thu Jul 20 2017 - 19:24:41 EST
On Thu, Jul 20, 2017 at 3:34 PM, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>>
>> "Why does this matter, and what is the range it matters for?"
>
> I was looking to do some work on the idle estimator. Parts of that keep
> online avg and variance for normal distributions. I wanted to bias the
> avg downwards, the way to do that is to subtract a scaled stdev from it.
> Computing the stdev from a variance requires the sqrt.
>
> In any case, I suppose the range of values would be in the order of
> TICK_NSEC, so the variance would be a number of orders below that. So
> we're looking at fairly small numbers <1e5.
Ok. So that already cuts down the problem space a lot.
And yes, the current code is very suboptimal for exactly the small number case.
It's also typiocally the case that right-shifting is more expensive
than left-shifting, so the fact that we left-shift twice in that loop
for all the big values is extra expensive.
So I would actually suggest a different approach:
- start with a small 'm'
- and grow it up.
The "small m" means that there is a small constant, which is good.
And growing it up is a single trivial left shift by two, which can
also be done just two adds or as a single lea, so it works fine even
on architectures with slow shifters etc.
So mayube something like the attached?
NOTE NOTE NOTE! This may be pure garbage. It's untested. But the code
generation looks ok even if gcc is being way too clever and turns that
first loop into a counted loop instead. Whatever, maybe it's the right
thing to do.
But note that I might have broken the sqrt for some case, so you need
to double-check my logic there. The *intention* is that it's the exact
same thing as it used to do, just initializing 'm' differently.
Linus
lib/int_sqrt.c | 18 +++++++++++++-----
1 file changed, 13 insertions(+), 5 deletions(-)
diff --git a/lib/int_sqrt.c b/lib/int_sqrt.c
index 1ef4cc344977..da3b3dabad8e 100644
--- a/lib/int_sqrt.c
+++ b/lib/int_sqrt.c
@@ -16,14 +16,22 @@
*/
unsigned long int_sqrt(unsigned long x)
{
- unsigned long b, m, y = 0;
+ unsigned long m, y;
if (x <= 1)
return x;
- m = 1UL << (BITS_PER_LONG - 2);
- while (m != 0) {
- b = y + m;
+ m = 64;
+ do {
+ unsigned long new_m = m << 2;
+ if (!new_m)
+ break;
+ m = new_m;
+ } while (m < x);
+
+ y = 0;
+ do {
+ unsigned long b = y + m;
y >>= 1;
if (x >= b) {
@@ -31,7 +39,7 @@ unsigned long int_sqrt(unsigned long x)
y += m;
}
m >>= 2;
- }
+ } while (m);
return y;
}