Re: Bad gcc code for atomic_dec

Linus Torvalds (torvalds@transmeta.com)
11 Jan 1998 17:09:37 GMT


In article <19980111102546.3725.qmail@mail.ocs.com.au>,
Keith Owens <kaos@ocs.com.au> wrote:
>gcc 2.7.2.3 for i586 generates an extra load for atomic_dec(&xxx). It
>outputs
>
> movl xxx,%ebx
> lock decl xxx
>
>and never uses the value in %ebx, wasting a register. Strangely
>enough, atomic_inc is fine, just "lock incl xxx".

It probably depends on the code around the thing - there shouldn't be a
difference between inc/dec (as far as gcc is concerned, the patterns for
inc and dec are exactly the same, the string it outputs is just
different). Gcc tends to be slightly confused by the inline assembly,
and in this particular case one of the reasons is that I had to use an
input/output constraint pair that looks like

:"=m" (__atomic_fool_gcc(v))
:"m" (__atomic_fool_gcc(v)));

which makes gcc think that there are _two_ separate accesses. Due to
that, some common subexpression logic kicks in and makes for slightly
messy code.
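
As a rough illustration of the pattern described above, here is a minimal
sketch of a decrement that lists the same memory location once as an output
and once as an input. The names sketch_atomic_t and sketch_atomic_dec are
hypothetical, the plain v->counter operand stands in for the
__atomic_fool_gcc(v) wrapper, and the non-x86 fallback is only there so the
sketch builds elsewhere; none of this is the actual kernel header.

```c
/* Hypothetical stand-in for the kernel's atomic type (an assumption). */
typedef struct { volatile int counter; } sketch_atomic_t;

static inline void sketch_atomic_dec(sketch_atomic_t *v)
{
#if defined(__i386__) || defined(__x86_64__)
    /*
     * The same location appears twice: "=m" as output, "m" as input.
     * gcc sees two separate memory operands, which is what the post
     * blames for the slightly messy code around the asm.
     */
    __asm__ __volatile__("lock; decl %0"
                         : "=m" (v->counter)
                         : "m" (v->counter));
#else
    /* Fallback for other architectures; an assumption, not from the post. */
    __atomic_fetch_sub(&v->counter, 1, __ATOMIC_SEQ_CST);
#endif
}
```

Whether the extra load shows up will depend on the gcc version and on the
surrounding code, as noted above.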

The correct constraint would actually be

:"=m" (__atomic_fool_gcc(v))
:"0" (__atomic_fool_gcc(v)));

to tell gcc that the input is the _same_ as the output, but sadly gcc
can't take this kind of constraint reliably (the "same" constraint works
reliably only for registers, not for memory accesses - exact behaviour
depends a bit on the version of gcc in question and the complexity of
the code around the inline asm, but essentially the "same" constraint
for memory accesses can make gcc abort with "impossible constraint"
errors).

The gcc "md" files can actually take a "+" in the constraint for an
operand that is both read and written to, but that doesn't work in
inline assembly. Pity - it would also work fine in this case.
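
For comparison, this is roughly what the read-write form would look like
with a single "+m" operand. Later gcc releases do accept "+" in inline
assembly, but that goes beyond the compiler discussed here, so treat this as
a sketch under that assumption. The names rw_atomic_t and rw_atomic_dec are
hypothetical, and the non-x86 fallback is only there so the sketch builds
elsewhere.

```c
/* Hypothetical stand-in for the kernel's atomic type (an assumption). */
typedef struct { volatile int counter; } rw_atomic_t;

static inline void rw_atomic_dec(rw_atomic_t *v)
{
#if defined(__i386__) || defined(__x86_64__)
    /*
     * One operand marked "+m": read and written by the instruction.
     * This assumes a gcc new enough to accept "+" in inline asm.
     */
    __asm__ __volatile__("lock; decl %0"
                         : "+m" (v->counter));
#else
    /* Fallback for other architectures; an assumption, not from the post. */
    __atomic_fetch_sub(&v->counter, 1, __ATOMIC_SEQ_CST);
#endif
}
```

With only one operand, gcc has no second access to the same location to
reason about.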

Oh, well.. I've grown used to inline assembly not generating perfect
code, but it tends to generate close enough to what I want that I can't
really complain.

Linus