Re: egcs 1.0.1 miscompiles Linux 2.0.33

Linus Torvalds (torvalds@transmeta.com)
Wed, 25 Feb 1998 21:21:59 -0800 (PST)


On 25 Feb 1998, Andi Kleen wrote:
>
> Last time a similar problem came up (with pgcc) it was attributed to
> some bad constraints in asm-i386/string.h. If I remember right, Linus
> argued at the time that gcc wasn't following its own documented
> behaviour, so he didn't intend to fix it in the kernel.

Right. As far as I can tell, Linux does completely legal asm statements.
And here "legal" isn't some arbitrary decision by me, but according to all
the documentation and the obvious meaning of things.

> Maybe Linus can comment himself on the issue?

I will.

> Afaik the ioport.c problem (which doesn't occur on egcs 1.0.x because
> it doesn't do the ADDRESSOF optimization yet)

Note that the ADDRESSOF optimization is a completely valid one, and when I
heard that gcc was breaking that particular thing I was very happy - it
meant only that gcc was getting more clever. I knew that code was fragile
when I wrote it, but it happened to work.

2.1.x already has another way of handling iopl() which is actually much
better than the broken one anyway. And if gcc optimizes that away I'll
consider it a bug. Back-porting that one to 2.0.x is very straightforward.

So I'd like to state again that I've never considered the ADDRESSOF
optimization a gcc bug, and that once I heard gcc was clever enough to
optimize the thing away I immediately patched the current 2.1.x kernel.

> and this string.h problem
> that bites with some drivers are the only known problems that cause the
> Linux kernel to be miscompiled by egcs/gcc 2.8.0

But the string.h one is definitely a gcc bug, and nobody has convinced me
otherwise.

> Here is a message from Gabriel Paubert with an analysis:
>
> ------------- please bite here ------------- bitte hier abbeissen ---------
>
>
> Sorry if this message is long, but I think it is _very_ important. And
> although it is probably more kernel related than gcc related, it is
> also interesting to all linux-gcc readers IMHO.
>
> All started with the following exception report:

[ removed for brevity ]

> After analysing it I came to the conclusion that it occurred in an inlined
> strstr function. Oliver used pgcc to compile 2.0.30 and I could not reproduce
> the problem with standard gcc. However, the strange thing was the content
> of %edi: "DE43" in ASCII. The pointed-to value instead of the pointer!
>
> Then after rambling through GCC doc (I've learnt a lot today :-)), I came
> to the conclusion that this is either a bug in pgcc or, more likely
> bad constraints in asm-i386/string.h. Let us have a look at the strstr
> asm statement:
>
> __asm__ __volatile__(
> "cld\n\t" \
> "movl %4,%%edi\n\t"
> "repne\n\t"
> "scasb\n\t"
> "notl %%ecx\n\t"
> "decl %%ecx\n\t" /* NOTE! This also sets Z if searchstring='' */
> "movl %%ecx,%%edx\n"
> "1:\tmovl %4,%%edi\n\t"
> "movl %%esi,%%eax\n\t"
> "movl %%edx,%%ecx\n\t"
> "repe\n\t"
> "cmpsb\n\t"
> "je 2f\n\t" /* also works for empty string, see above */
> "xchgl %%eax,%%esi\n\t"
> "incl %%esi\n\t"
> "cmpb $0,-1(%%eax)\n\t"
> "jne 1b\n\t"
> "xorl %%eax,%%eax\n\t"
> "2:"
> :"=a" (__res):"0" (0),"c" (0xffffffff),"S" (cs),"g" (ct)
> :"cx","dx","di","si");
> return __res;
> }
>
> If I understand the GCC doc correctly, nothing prevents gcc from using %edi
> and %edx to compute argument %4. So I sent the patch appended at the end,
> which tries to prevent this by allocating %edi as an input parameter and
> using two additional scratch variables, allocated by the compiler, as
> modified-and-used ("=&" constraint) parameters.

Incorrect. Gcc documentation _explicitly_ states that when you mark a
register clobbered ("cx","dx","di","si" in this case: %edi is definitely
on the list), then that register will NOT be used either for inputs or for
outputs, and that the inline assembly can read and write it multiple
times.

The fact that gcc uses the register for address computations is in clear
violation of the documentation that states that you can read and write the
register many times. QED.
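
To make the clobber contract concrete, here is a minimal sketch, assuming
an x86 target and a GCC-compatible compiler. The helper scratch_strlen is
hypothetical, not kernel code: it names "dx" in the clobber list and then
reads and writes %dl as a temporary on every loop iteration, which is the
usage the paragraph above describes. On other targets the sketch falls
back to plain C.

```c
#include <stddef.h>

/* Hypothetical helper (not from the kernel): counts the bytes before the
 * terminating NUL, using %dl as a scratch register inside the asm. */
static size_t scratch_strlen(const char *s)
{
    size_t n = 0;
#if defined(__i386__) || defined(__x86_64__)
    __asm__ __volatile__(
        "1:\n\t"
        "movb (%1,%0), %%dl\n\t"   /* load s[n] into the temporary */
        "testb %%dl, %%dl\n\t"     /* end of string? */
        "je 2f\n\t"
        "inc %0\n\t"               /* n++ */
        "jmp 1b\n"
        "2:"
        : "+&r" (n)                /* read and written; early-clobber */
        : "r" (s)
        : "dx", "cc", "memory");   /* "dx" is ours to use as a temporary */
#else
    /* Portable fallback so the sketch still builds on non-x86 targets. */
    while (s[n] != '\0')
        n++;
#endif
    return n;
}
```

With this, scratch_strlen("DE43") is expected to return 4. Because "dx" is
in the clobber list, the compiler should place neither n nor s in that
register, so the temporary cannot collide with an operand.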

This is not to say that we can't work around problems like this, but I get
_very_ irritated when people claim that they are kernel bugs when the gcc
documentation is clearly of a different opinion (and the _only_ sane
reason for having a clobber-list in the first place is to mark registers
that are used as temporaries in the asm statement, so _obviously_ using
them for address arguments inside the asm is broken - I can't believe that
anybody can try to explain that away).

The other known gcc-2.8.0 bug was that gcc-2.8.0 will move asm statements
that have no outputs around, even though (again) the gcc documentation
clearly states that an asm statement with no outputs is considered to be
"volatile". Again, this wasn't a question of interpretation or anything
like that - the documentation very clearly and explicitly states that.
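
A minimal sketch of such a statement, assuming an x86 target and a
GCC-compatible compiler. The helpers poke_byte and poke_then_peek are
hypothetical names, not kernel code. The asm has inputs and a "memory"
clobber but no outputs, so its whole purpose is its side effect. The
sketch spells out __volatile__ explicitly instead of relying on the
implicit rule.

```c
/* Hypothetical helper: stores v at *p using an asm with no output
 * operands. Its only effect is the memory write, so it must not be
 * moved or deleted. */
static void poke_byte(unsigned char *p, unsigned char v)
{
#if defined(__i386__) || defined(__x86_64__)
    __asm__ __volatile__(
        "movb %b1, (%0)"
        : /* no outputs */
        : "r" (p), "q" (v)         /* "q": a byte-addressable register */
        : "memory");               /* the asm writes to memory */
#else
    /* Portable fallback so the sketch still builds on non-x86 targets. */
    *p = v;
#endif
}

/* Hypothetical usage helper: the store has to happen before the read. */
static unsigned char poke_then_peek(unsigned char v)
{
    unsigned char slot = 0;
    poke_byte(&slot, v);
    return slot;
}
```

If the compiler moved or dropped the asm, poke_then_peek(0x5a) would
return 0 instead of 0x5a.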

In short, I think gcc should be fixed, or at the very least it should be a
_documented_ bug at which point I have no gripe with fixing it in the
kernel. But I refuse to work around undocumented bugs - that starts to
smell too much like microsoft, and then I might as well start using
Windows myself.

Linus
