Re: [PATCH 46/74] x86, lto: Disable fancy hweight optimizations for LTO

From: Jan Beulich
Date: Mon Aug 20 2012 - 06:57:11 EST


>>> On 19.08.12 at 17:15, Andi Kleen <andi@xxxxxxxxxxxxxx> wrote:
>> >--- a/arch/x86/include/asm/arch_hweight.h
>> >+++ b/arch/x86/include/asm/arch_hweight.h
>> >@@ -25,9 +25,14 @@ static inline unsigned int __arch_hweight32(unsigned int w)
>> >{
>> > unsigned int res = 0;
>> >
>> >+#ifdef CONFIG_LTO
>> >+ res = __sw_hweight32(w);
>> >+#else
>> >+
>> > asm (ALTERNATIVE("call __sw_hweight32", POPCNT32, X86_FEATURE_POPCNT)
>> > : "="REG_OUT (res)
>> > : REG_IN (w));
>> >+#endif
>>
>> Isn't this a little too harsh? Rather than not using popcnt at all, why don't
>> you just add the necessary clobbers to the asm() in the LTO case?
>
> gcc currently lacks the means to declare that an asm uses an external
> symbol. Ok we could make it visible. But there's no way to make the
> special calling convention work anyways, at least not without someone
> changing gcc to allow declaring this per function.

That's not the point: The point really is that you could allow the
alternative regardless of LTO, and just penalize the LTO case
by having even the asm clobber the registers that a function call
would not preserve.

> I'm not sure the optimization is really worth it anyways, hweight should
> be uncommon.

That's a separate question (but I sort of agree - not sure whether
CPU mask weights ever get calculated on hot paths).

Jan
