Re: [GIT PULL] optimize 64-by-32 ddivision for constant divisors on 32-bit machines

From: Nicolas Pitre
Date: Tue Nov 24 2015 - 00:37:26 EST


On Mon, 23 Nov 2015, Arnd Bergmann wrote:

> On Monday 23 November 2015 11:04:33 Nicolas Pitre wrote:
> >
> > OK... I'm able to "fix" the build with:
> >
> > diff --git a/include/asm-generic/div64.h b/include/asm-generic/div64.h
> > index 163f77999e..d246c4c801 100644
> > --- a/include/asm-generic/div64.h
> > +++ b/include/asm-generic/div64.h
> > @@ -206,7 +206,7 @@ extern uint32_t __div64_32(uint64_t *dividend, uint32_t divisor);
> > uint32_t __rem; \
> > (void)(((typeof((n)) *)0) == ((uint64_t *)0)); \
> > if (__builtin_constant_p(__base) && \
> > - is_power_of_2(__base)) { \
> > + is_power_of_2(__base) && __base != 0) { \
> > __rem = (n) & (__base - 1); \
> > (n) >>= ilog2(__base); \
> > } else if (__div64_const32_is_OK && \
> >
> > What doesn't make sense to me is the fact that is_power_of_2() is
> > defined as:
> >
> > static inline __attribute__((const))
> > bool is_power_of_2(unsigned long n)
> > {
> > return (n != 0 && ((n & (n - 1)) == 0));
> > }
> >
> > So the test for zero is already in there.
> >
> > And adding BUILD_BUG_ON(__builtin_constant_p(__base) && __base == 0)
> > before the if doesn't trig either.
>
> I've seen similarly messed up situations with PROFILE_ALL_BRANCHES
> before, I think it's got something to do with how __builtin_constant_p()
> is used inside of the __trace_if() macro, and how gcc sometimes falls
> back to treating variables as not-really-constant based on context.
>
> To gcc, __builtin_constant_p is just best-effort, and they don't care
> about returning false sometimes if they catch most cases in practice.

OK... I produced a minimal test case. I think gcc is screwed. And it is
not a question of __builtin_constant_p being best effort. The resulting
code is plain wrong where a variable is suddenly turned into a constant
of value 0. Any random modification to various part of the code just
makes it disappear but I didn't check the disassembly to see if it is
then correct.

And the good news(tm) is that the same bug is triggered with x86 gcc as
well.

Please try the attached test case.


Nicolas

Attachment: gcc_testcase.tgz
Description: application/gzip