But in my measurements, POPF is not fast even in the case where the
restored flags are not changed at all:
        mov     $200*1000*1000, %eax    # loop counter: 200M iterations
        pushf
        pop     %rbx                    # rbx = current flags
        .balign 64
loop:   push    %rbx
        popf                            # restore the same, unchanged flags
        dec     %eax
        jnz     loop
# perf stat -r20 ./popf_1g
     4,929,012,093      cycles          #  3.412 GHz             ( +-  0.02% )
       835,721,371      instructions    #  0.17 insn per cycle   ( +-  0.02% )
       1.446185359 seconds time elapsed                          ( +-  0.46% )
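As a rough sanity check on these counters, the totals can be divided by the loop count to get per-iteration figures. This is only a sketch: the variable and function names are made up for illustration, and it assumes the iteration count is exactly the 200*1000*1000 loaded into %eax, while the counters cover the whole run rather than the loop body alone.

```python
# Counter values copied from the perf stat output above.
cycles = 4_929_012_093
instructions = 835_721_371
# Loop count taken from the `mov $200*1000*1000, %eax` initializer.
iterations = 200 * 1000 * 1000


def per_iteration(total, n):
    """Average a whole-run counter over n loop iterations."""
    return total / n


cycles_per_iter = per_iteration(cycles, iterations)
insns_per_iter = per_iteration(instructions, iterations)
ipc = instructions / cycles

print(f"{cycles_per_iter:.1f} cycles per iteration")        # 24.6
print(f"{insns_per_iter:.2f} instructions per iteration")   # 4.18
print(f"{ipc:.2f} instructions per cycle")                   # 0.17
```

That is about 24.6 cycles for each push/popf/dec/jnz iteration, and the computed 0.17 instructions per cycle matches the figure perf reports.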
If I replace POPF with a pop into an unused register, I get this:
You are comparing apples and bananas here.