Re: [PATCH] speed up on find_first_bit for i386 (let compiler dothe work)

From: David Woodhouse
Date: Fri Jul 29 2005 - 05:11:33 EST


On Thu, 2005-07-28 at 10:25 -0700, Linus Torvalds wrote:
> Basic rule: inline assembly is _better_ than random compiler extensions.
> It's better to have _one_ well-documented extension that is very generic
> than it is to have a thousand specialized extensions.

Counterexample: FR-V and its __builtin_read8() et al. For FR-V you have
to issue a memory barrier before or after certain I/O instructions, but
in some circumstances you can omit them. The compiler knows this and can
omit the membar instructions as appropriate -- but doing the same
optimisations in inline assembly would be fairly much impossible.

Builtins can also allow the compiler more visibility into what's going
on and more opportunity to optimise. They can also set condition
registers, which you can't do from inline assembly -- if you want to
perform a test in inline asm, you have to put the result in a register
and then test the contents of that register. (You can't just branch from
the inline asm either, although we used to try).

Builtins are more portable and their implementation will improve to
match developments in the target CPU. Inline assembly, as we have seen,
remains the same for years while the technology moves on.

Although it's often the case that inline assembly _is_ better,
especially in code which is arch-specific in the first place, I wouldn't
necessarily assume that it's always the case.

--
dwmw2


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/