Re: [PATCH -tip 1/2] x86/hweight: Fix false output register dependency of POPCNT insn

From: Borislav Petkov
Date: Tue Mar 25 2025 - 13:10:41 EST


On Tue, Mar 25, 2025 at 05:48:37PM +0100, Uros Bizjak wrote:
> +/*
> + * On Sandy/Ivy Bridge and later Intel processors, the POPCNT instruction
> + * appears to have a false dependency on the destination register. Even
> + * though the instruction only writes to it, the instruction will wait
> + * until destination is ready before executing. This false dependency
> + * was fixed for Cannon Lake (and later) processors.

Any official documentation about that?

Any performance numbers to justify that change?

Because if it doesn't matter, why do it in the first place? Especially if
you're doing this XORing now for *everyone* - not just the affected parties.
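
For reference, the kind of XORing in question can be sketched roughly as
below. This is only an illustration under stated assumptions, not the code
from the patch: the helper name popcnt_dep_broken() is hypothetical, the
asm path is only built on x86-64 when the compiler is told POPCNT is
available (e.g. -mpopcnt), and everything else falls back to the compiler
builtin.

```c
/*
 * Hypothetical helper for illustration only; this is not the kernel's
 * __arch_hweight64() and not the code from the patch under review.
 */
static inline unsigned long popcnt_dep_broken(unsigned long w)
{
#if defined(__x86_64__) && defined(__POPCNT__)
	unsigned long res;

	/*
	 * Zero the destination first (a recognized zeroing idiom), so that
	 * POPCNT's output register has no pending producer to wait on.
	 * The early-clobber "=&r" keeps res out of any register used by w;
	 * otherwise the XOR would destroy the input before POPCNT reads it.
	 */
	asm("xorl %k0, %k0\n\t"
	    "popcntq %1, %0"
	    : "=&r" (res)
	    : "rm" (w)
	    : "cc");
	return res;
#else
	/* Assumed fallback: let the compiler pick the implementation. */
	return (unsigned long)__builtin_popcountl(w);
#endif
}
```

Note that the extra XOR is emitted unconditionally in the asm path, which
is exactly why it costs something on CPUs that do not have the false
dependency.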

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette