Re: [PATCH 2/5] bitops: compile time optimization for hweight_long(CONSTANT)

From: Borislav Petkov
Date: Sun Feb 14 2010 - 06:25:03 EST


On Sun, Feb 14, 2010 at 11:12:23AM +0100, Peter Zijlstra wrote:
> On Thu, 2010-02-11 at 18:24 +0100, Borislav Petkov wrote:
> > On Mon, Feb 08, 2010 at 10:59:45AM +0100, Borislav Petkov wrote:
> > > Let me prep another version when I get back on Wed. (currently
> > > travelling) with all the stuff we discussed to see how it would turn.
> >
> > Ok, here's another version ontop of PeterZ's patch at
> > http://lkml.org/lkml/2010/2/4/119. I need to handle 32- and 64-bit
> > differently wrt to popcnt opcode so on 32-bit I do "popcnt %eax, %eax"
> > while on 64-bit I do "popcnt %rdi, %rdi".
>
> Right, so I don't like how you need to touch !x86 for this, and I think
> that is easily avoidable by not making x86 include
> asm-generic/bitops/arch_hweight.h.
>
> If you then add __sw_hweightN() -> __arch_hweightN() wrappers in
> arch_hweight.h, then you can leave const_hweight.h use __arch_hweightN()
> and simply provide __arch_hweightN() from x86/include/asm/bitops.h

Hmm, all these different names are starting to get a little confusing.
Can we first agree on the naming, please? Here's my proposal:

__const_hweightN - for arguments which are known constants at compile time
__arch_hweightN - the arch possibly has an optimized hweight version
__sw_hweightN - fallback when nothing else is there, aka the functions in
lib/hweight.c
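
To make the three layers concrete, here is a minimal userspace sketch
of how they could fit together for N=32. It is not the actual patch:
the hweight32() dispatch macro and the plain __arch_hweight32() wrapper
are my assumptions about how the pieces would be glued, and the
__sw_hweight32() body is just the usual bit-trick popcount standing in
for lib/hweight.c.

```c
#include <assert.h>	/* static_assert (C11) */
#include <stdint.h>

/*
 * Compile-time variant: a pure expression, so the compiler can fold it
 * when the argument is a constant. Note that it evaluates (w) many times.
 */
#define __const_hweight8(w)			\
	((unsigned int)(			\
	 (!!((w) & (1ULL << 0))) +		\
	 (!!((w) & (1ULL << 1))) +		\
	 (!!((w) & (1ULL << 2))) +		\
	 (!!((w) & (1ULL << 3))) +		\
	 (!!((w) & (1ULL << 4))) +		\
	 (!!((w) & (1ULL << 5))) +		\
	 (!!((w) & (1ULL << 6))) +		\
	 (!!((w) & (1ULL << 7)))))
#define __const_hweight16(w) (__const_hweight8(w)  + __const_hweight8((w)  >> 8))
#define __const_hweight32(w) (__const_hweight16(w) + __const_hweight16((w) >> 16))

/* Software fallback, standing in for the lib/hweight.c version. */
static unsigned int __sw_hweight32(uint32_t w)
{
	w = w - ((w >> 1) & 0x55555555u);
	w = (w & 0x33333333u) + ((w >> 2) & 0x33333333u);
	w = (w + (w >> 4)) & 0x0f0f0f0fu;
	return (w * 0x01010101u) >> 24;
}

/* Arch hook; here simply the default, i.e. no optimized version. */
static inline unsigned int __arch_hweight32(uint32_t w)
{
	return __sw_hweight32(w);
}

/* Assumed dispatch: constants are folded, everything else goes to the arch. */
#define hweight32(w) \
	(__builtin_constant_p(w) ? __const_hweight32(w) : __arch_hweight32(w))

/* The constant case really is evaluated at compile time. */
static_assert(__const_hweight32(0xF0F0F0F0u) == 16, "constant folding works");
```

With this, hweight32(0xff) never generates a call, while hweight32(x)
for a runtime x ends up in whatever __arch_hweight32() the arch
provides.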

Now, in the x86 case, when the compiler can't know that the argument is
a constant, we call the __arch_hweightN versions. The alternative calls
the __sw_hweightN version in case the CPU doesn't support popcnt. In
this case, we need to build __sw_hweightN with -fcall-saved* for gcc to
be able to take care of the regs clobbered by __sw_hweightN.

So, if I understand you correctly, your suggestion might work: we
simply need to rename the lib/hweight.c versions to __sw_hweightN
and have <asm-generic/bitops/arch_hweight.h> provide __arch_hweightN ->
__sw_hweightN wrappers in the default case. All arches which have an
optimized version will provide it in their respective bitops header...
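
Roughly, the header split could look like the sketch below, squeezed
into one file so it builds in userspace. DEMO_ARCH_HAS_HWEIGHT is a
made-up switch (not a real Kconfig symbol) which stands for "this arch
ships its own __arch_hweightN and does not include the asm-generic
header", and __builtin_popcount() is only a stand-in for an optimized
instruction, not what the x86 patch does.

```c
#include <stdint.h>

/* Renamed lib/hweight.c version: the generic software fallback. */
static unsigned int __sw_hweight32(uint32_t w)
{
	w = w - ((w >> 1) & 0x55555555u);
	w = (w & 0x33333333u) + ((w >> 2) & 0x33333333u);
	w = (w + (w >> 4)) & 0x0f0f0f0fu;
	return (w * 0x01010101u) >> 24;
}

#ifdef DEMO_ARCH_HAS_HWEIGHT	/* hypothetical switch, for illustration only */

/* What an arch's own bitops header might provide instead. */
static inline unsigned int __arch_hweight32(uint32_t w)
{
	return (unsigned int)__builtin_popcount(w);	/* stand-in */
}

#else

/* Default case, as <asm-generic/bitops/arch_hweight.h> would have it. */
static inline unsigned int __arch_hweight32(uint32_t w)
{
	return __sw_hweight32(w);
}

#endif
```

Callers only ever see __arch_hweight32(), so const_hweight.h does not
need to know which of the two variants it got.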

How's that?

--
Regards/Gruss,
Boris.