Re: [patch 5/6] [RFD] timekeeping: Provide optional 128bit math

From: Peter Zijlstra
Date: Fri Dec 09 2016 - 05:01:48 EST


On Fri, Dec 09, 2016 at 09:30:11AM +0100, Peter Zijlstra wrote:
> +static inline u64 mul_u32_u32(u32 a, u32 b)
> +{
> + u64 ret;
> +
> + asm ("mull %[b]" : "=A" (ret) : [a] "a" (a), [b] "g" (b) );
> +
> + return ret;
> +}

ARGH, that's broken on x86_64, it needs to be:

	u32 high, low;

	asm ("mull %[b]" : "=a" (low), "=d" (high)
			 : [a] "a" (a), [b] "g" (b) );

	return low | ((u64)high) << 32;

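The 'A' constraint doesn't work right there: on x86_64 a 64-bit value fits
in a single register, so "=A" allocates it to either RAX or RDX rather than
to the EDX:EAX pair that mull writes.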

And with that, all the benchmark results are broken too.
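
For reference, the corrected helper can be sketched as a standalone
userspace function. This is only a sketch under a few assumptions that are
not in the snippet above: it uses <stdint.h> types instead of the kernel's
u32/u64, it spells the keyword __asm__ so it also builds with strict -std
flags, it uses an "rm" constraint for b instead of "g" (mull has no
immediate form, so a constant operand must not be passed through), and it
falls back to a plain C multiply on non-x86 targets.

```c
#include <stdint.h>

/*
 * Sketch of a 32x32 -> 64 bit multiply.
 *
 * mull multiplies EAX by its operand and leaves the 64-bit product in
 * EDX:EAX, so the two halves are read back through separate "=a" and
 * "=d" outputs. That works the same for -m32, -m64 and -mx32.
 */
static inline uint64_t mul_u32_u32(uint32_t a, uint32_t b)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
	uint32_t high, low;

	/* Assumption: "rm" rather than "g", to keep immediates out. */
	__asm__ ("mull %[b]" : "=a" (low), "=d" (high)
			     : [a] "a" (a), [b] "rm" (b));

	return low | ((uint64_t)high << 32);
#else
	/* Portable fallback: widen one operand before multiplying. */
	return (uint64_t)a * b;
#endif
}
```

A call such as mul_u32_u32(0xffffffff, 0xffffffff) should then return
0xfffffffe00000001 regardless of which -m flag is used.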



root@ivb-ep:~/spinlocks# for i in -m64 -m32 -mx32 ; do echo $i; gcc -O3 $i -o mult mult.c -lm; ./mult; done

-m64
cond: avg: 7.474872 +- 0.008302
uncond: avg: 9.116401 +- 0.008468
128: avg: 0.826584 +- 0.005514

-m32
cond: avg: 16.604030 +- 0.009808
uncond: avg: 13.115470 +- 0.004452

-mx32
cond: avg: 6.168156 +- 0.006650
uncond: avg: 7.202092 +- 0.006813
128: avg: 0.081809 +- 0.008440