Re: [PATCH -tip 2/2] x86/hweight: Use POPCNT when available with X86_NATIVE_CPU option

From: Uros Bizjak
Date: Sun Mar 30 2025 - 11:15:38 EST


On Tue, Mar 25, 2025 at 6:11 PM Borislav Petkov <bp@xxxxxxxxx> wrote:
>
> On Tue, Mar 25, 2025 at 05:48:38PM +0100, Uros Bizjak wrote:
> > +#ifdef __POPCNT__
> > +     asm_inline (ASM_FORCE_CLR "popcntl %[val], %[cnt]"
> > +                 : [cnt] "=&r" (res)
> > +                 : [val] ASM_INPUT_RM (w));
> > +#else
> >       asm_inline (ALTERNATIVE(ANNOTATE_IGNORE_ALTERNATIVE
> >                               "call __sw_hweight32",
> >                               ASM_CLR "popcntl %[val], %[cnt]",
> >                               X86_FEATURE_POPCNT)
> >                        : [cnt] "=a" (res), ASM_CALL_CONSTRAINT
> >                        : [val] REG_IN (w));
> > -
> > +#endif
>
> A whopping 599 bytes which makes the asm more ugly.
>
> Not worth the effort IMO.

You missed this part:

--q--
... where there is no need for an entry in the .altinstr_replacement
section, shrinking all text sections by 9476 bytes:

    text     data     bss      dec      hex filename
27267068  4643047  814852 32724967  1f357e7 vmlinux-old.o
27257592  4643047  814852 32715491  1f332e3 vmlinux-new.o
--/q--
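
For illustration only, the compile-time selection can be sketched in
plain C outside the kernel. This is not the patch itself: the names
hweight32_sketch() and sw_hweight32_sketch() are hypothetical, the
compiler builtin stands in for the inline asm, and the fallback is a
generic bit-counting routine rather than the kernel's __sw_hweight32().
The only part taken from the patch is the test on __POPCNT__, which the
compiler defines when the target is known to support POPCNT.

```c
#include <stdint.h>

/*
 * Hypothetical software fallback (assumption: a generic parallel
 * bit count, standing in for the kernel's __sw_hweight32()).
 */
static inline unsigned int sw_hweight32_sketch(uint32_t w)
{
	w -= (w >> 1) & 0x55555555u;                      /* 2-bit sums */
	w = (w & 0x33333333u) + ((w >> 2) & 0x33333333u); /* 4-bit sums */
	w = (w + (w >> 4)) & 0x0f0f0f0fu;                 /* 8-bit sums */
	return (w * 0x01010101u) >> 24;                   /* add the four bytes */
}

/* Hypothetical wrapper: the choice is made at build time, not at boot. */
static inline unsigned int hweight32_sketch(uint32_t w)
{
#ifdef __POPCNT__
	/* Target is known to have POPCNT, so no runtime patching is needed. */
	return (unsigned int)__builtin_popcount(w);
#else
	/* Target may lack POPCNT: use the software routine. */
	return sw_hweight32_sketch(w);
#endif
}
```

Built with -mpopcnt (or -march=native on a CPU that has it), only the
first branch is compiled in. That is the same reason the patched kernel
needs no .altinstr_replacement entry for these call sites.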

Thanks,
Uros.