Re: [PATCH] Make div64_u64() precise on 32bit platforms

From: Oleg Nesterov
Date: Wed Oct 13 2010 - 17:42:19 EST


On 10/12, Brian Behlendorf wrote:
>
> I'm resending the patch as is and adding what I hope are the right CCs. Also
> let me explain why I opted to add abs64() and use the gcc builtin.
>
> >Can't we just improve abs? Say,
>
> I was reluctant to change abs() since it would have a much larger impact on
> the code base. Using typeof() should be OK but if any of the callers
> mistakenly call abs() with an unsigned value then we could see compiler
> warnings about '__x < 0' being a useless conditional.

I see. Probably in this case we want this warning. But I agree, it is better
to make a separate patch for such a change.

> >This is a bit unusual. I mean, it is not that common to use gcc builtins
> >in the normal code. And, it seems, we can use __fls(divisor >> 32) or
> >just fls64() instead ?
>
> I opted for the gcc builtin because I felt it made the code more readable. I
> also suspect it will perform slightly better than __fls() on some archs.

Well, compared to div_64() we are going to do, this is nothing. But
I won't argue.


I think the patch is correct. A couple of questions though,

> + * 'http://www.hackersdelight.org/HDcode/newCode/divDouble.c'

404

> u64 div64_u64(u64 dividend, u64 divisor)
> {
> - u32 high, d;
> -
> - high = divisor >> 32;
> - if (high) {
> - unsigned int shift = fls(high);
> + u64 u0, quot0, quot1;
> + u32 rem;
> + int n;
> +
> + if (divisor >> 32 == 0) {
> + if (dividend >> 32 < divisor) {
> + return div_u64_rem(dividend, divisor, &rem);
> + } else {
> + u0 = dividend & 0xFFFFFFFF;
> + quot1 = div_u64_rem(dividend >> 32, divisor, &rem);
> + u0 += ((u64)rem << 32);
> + quot0 = div_u64_rem(u0, divisor, &rem);
> + return (quot1 << 32) + quot0;
> + }

Looks correct... but I can't understand these complications.
Looks like we can just do

if ((divisor >> 32) == 0) {
div_u64(dividend, divisor);
} else {
...

No?

> + } else {
> + n = __builtin_clzll(divisor);
> + quot1 = div_u64_rem(dividend >> 1, (divisor << n) >> 32, &rem);
> + quot0 = (quot1 << n) >> 31;

I can't understand this "dividend >> 1". It seems to me that

quot1 = div_u64(dividend, (divisor << n) >> 32);
quot0 = (quot1 << n) >> 32;

should be equally correct. Or I missed some overflow?


Anyway this looks correct, but I almost died trying to understand this
code (or, better say, to convince myself I can understand it ;)

Looks like, if we denote

A = dividend
B = divisor
K = 1ull << (32 - n)

then
quot0 = A / (B - (B % K))

which is obviously >= A/B. All we need is to ensure is that it is
<= A/B + 1, and this seems to be true.

So, I believe the patch is correct.



A bit off-topic,

> uu = tabu[i];
> vu = tabu[j];
> qu = div64_u64(uu, vu);
> ru = uu - qu * vu;
> if (qu > uu || ru >= vu) {
> printk("%016llx/%016llx != %016llx "
> "rem %016llx\n", uu, vu, qu, ru);
> errors++;
> }

I wouldn't trust this check too much. I mean, it can miss an error.
For example, consider something like

vu = -1ll
uu = vu / 2
qu = 2 // suppose that div64_u64() is very wrong

Afaics, this wrong qu passes the check.

Oleg.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/