Re: [PATCH -tip 1/2] x86/hweight: Fix false output register dependency of POPCNT insn
From: Ingo Molnar
Date: Tue Mar 25 2025 - 17:50:58 EST
* Uros Bizjak <ubizjak@xxxxxxxxx> wrote:
> On Sandy/Ivy Bridge and later Intel processors, the POPCNT instruction
> appears to have a false dependency on the destination register. Even
> though the instruction only writes to it, the instruction will wait
> until the destination is ready before executing. This false dependency
> was fixed for Cannon Lake (and later) processors.
>
> Fix false dependency by clearing the destination register first.
>
> The x86_64 defconfig object size increases by 779 bytes:
>
>       text       data        bss        dec        hex    filename
>   27341418    4643015     814852   32799285    1f47a35    vmlinux-old.o
>   27342197    4643015     814852   32800064    1f47d40    vmlinux-new.o

I don't think adding an instruction to work around an old-microarchitecture
weakness that has already been fixed in newer hardware is worth bloating
the kernel.

Cannon Lake was released in 2018, 7 years ago.

It will be 1-2 years until such a change percolates to Linux users, and
by that time the microarchitecture with the fix (Cannon Lake) will be a
decade old, and a majority of Intel CPU users will be using it or
something newer.

So I don't think this particular change is worth it, unless the false
dependency can be shown to have a huge impact on pre-Cannon-Lake
CPUs - which I don't think it has.
Thanks,
Ingo