Re: [PATCH 2/3] x86: entry_64.S: use PUSH insns to build pt_regs on stack

From: Denys Vlasenko
Date: Wed Mar 18 2015 - 17:13:27 EST

On 03/18/2015 10:01 PM, Andy Lutomirski wrote:
> On Wed, Mar 18, 2015 at 12:47 PM, Denys Vlasenko <dvlasenk@xxxxxxxxxx> wrote:
>> We lose a number of large insns there:
>>    text    data     bss     dec     hex filename
>>    9863       0       0    9863    2687 entry_64_before.o
>>    9671       0       0    9671    25c7 entry_64.o
>> More importantly, we convert two "MOVQ $imm,off(%rsp)" to "PUSH $imm"
>> (the ones which fill pt_regs->cs,ss).
>> Before this patch, placing them on the fast path slowed it down by two cycles:
>> this form of MOV is very large, 12 bytes, and this probably reduces decode
>> bandwidth to one insn per cycle when the decoder meets them.
>> Therefore they lived in FIXUP_TOP_OF_STACK instead (away from the hot path).
> Does that mean that this has zero performance impact, or is it
> actually a speedup?

No, it's not a speedup, because those big bad instructions weren't
on the hot path to begin with.

We do want them to be on the hot path, though.

Inserting them as MOVs into the hot path (say, in order
to eliminate FIXUP_TOP_OF_STACK) *would be* a slowdown.

But we are switching to the PUSH method, and inserting them _as PUSHes_
seems to be a wash.
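
To make the size difference concrete, here is a rough sketch (not part of
the patch) that hand-encodes the two instruction forms and counts their
bytes. The selector value 0x33 and the displacement 0x88 are assumptions
picked for illustration. The MOV only reaches 12 bytes when the
displacement does not fit in a signed 8-bit field, and the PUSH only
shrinks to 2 bytes when the immediate does.

```python
import struct

def encode_movq_imm_to_rsp_disp32(imm, disp):
    """Encode `movq $imm, disp(%rsp)` in its disp32 form."""
    rex_w = 0x48   # REX.W prefix: 64-bit operand size
    opcode = 0xC7  # MOV r/m64, imm32 (sign-extended), opcode extension /0
    modrm = 0x84   # mod=10 (disp32 follows), reg=000 (/0), rm=100 (SIB follows)
    sib = 0x24     # scale=1, no index, base=%rsp
    return (bytes([rex_w, opcode, modrm, sib])
            + struct.pack("<i", disp)   # 4-byte displacement
            + struct.pack("<i", imm))   # 4-byte immediate

def encode_push_imm8(imm):
    """Encode `pushq $imm` in its imm8 form (immediate is sign-extended)."""
    return bytes([0x6A]) + struct.pack("<b", imm)

# Assumed values, for illustration only.
SELECTOR = 0x33      # hypothetical segment selector stored in pt_regs
DISPLACEMENT = 0x88  # hypothetical offset, too large for a disp8 encoding

mov = encode_movq_imm_to_rsp_disp32(SELECTOR, DISPLACEMENT)
push = encode_push_imm8(SELECTOR)

print(len(mov), mov.hex())    # MOV form: REX + opcode + ModRM + SIB + disp32 + imm32
print(len(push), push.hex())  # PUSH form: opcode + imm8
```

Under these assumptions each converted store shrinks from 12 bytes to 2,
which is the kind of saving behind the smaller text size quoted above.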

