Re: [PATCH 2/3] x86: entry_64.S: use PUSH insns to build pt_regs on stack
From: Denys Vlasenko
Date: Wed Mar 18 2015 - 17:33:18 EST
On 03/18/2015 10:22 PM, Andy Lutomirski wrote:
> On Wed, Mar 18, 2015 at 2:12 PM, Denys Vlasenko <dvlasenk@xxxxxxxxxx> wrote:
>> On 03/18/2015 10:01 PM, Andy Lutomirski wrote:
>>> On Wed, Mar 18, 2015 at 12:47 PM, Denys Vlasenko <dvlasenk@xxxxxxxxxx> wrote:
>>>> We lose a number of large insns there:
>>>>
>>>>    text    data     bss     dec     hex filename
>>>>    9863       0       0    9863    2687 entry_64_before.o
>>>>    9671       0       0    9671    25c7 entry_64.o
>>>>
>>>> What's more important, we convert two "MOVQ $imm,off(%rsp)" to "PUSH $imm"
>>>> (the ones which fill pt_regs->cs,ss).
>>>>
>>>> Before this patch, placing them on fast path was slowing it down by two cycles:
>>>> this form of MOV is very large, 12 bytes, and this probably reduces decode bandwidth
>>>> to one insn per cycle when it meets them.
>>>> Therefore they were living in FIXUP_TOP_OF_STACK instead (away from hot path).
>>>
>>> Does that mean that this has zero performance impact, or is it
>>> actually a speedup?
>>
>>
>> No, it's not a speedup because those big bad instructions weren't
>> on hot path to begin with.
>>
>> We want them to be there.
>>
>> Inserting them in a form of MOVs into hot path (say, in order
>> to eliminate FIXUP_TOP_OF_STACK) *would be* a slowdown.
>>
>> But we switch to PUSH method, and then inserting them _as PUSHes_
>> seems to be a wash.
>>
>
> Sorry, what I meant was: what was the performance impact of this patch
> on fast-path syscalls?
I measured the next patch (which added one additional push)
and it was a wash compared to timings before both patches.
See comment there.
I did not measure this patch in isolation this time around;
on the previous iteration of this patch it was a single-cycle speedup.
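
For reference, the 12-byte figure quoted above and the text-size delta from
the size(1) output can be reproduced with simple arithmetic. The sketch below
is only illustrative: the helper names are made up for this note, and it
assumes the MOV uses a 32-bit displacement off %rsp (which is what yields
12 bytes). Whether the assembler picks the imm8 or imm32 form of PUSH depends
on the immediate value, so both are shown.

```python
# Illustrative byte counts for the x86-64 encodings discussed in this thread.
# Helper names are hypothetical; they are not taken from the kernel tree.

def movq_imm32_to_rsp_mem_len(disp_bytes):
    """Length of 'MOVQ $imm32, disp(%rsp)'.

    REX.W prefix (1) + opcode C7 (1) + ModRM (1) + SIB (1, required when
    the base register is %rsp) + displacement (0, 1 or 4) + imm32 (4).
    """
    if disp_bytes not in (0, 1, 4):
        raise ValueError("displacement must be 0, 1 or 4 bytes")
    return 1 + 1 + 1 + 1 + disp_bytes + 4


def push_imm_len(imm_bytes):
    """Length of 'PUSH $imm': opcode 6A + imm8, or opcode 68 + imm32."""
    if imm_bytes not in (1, 4):
        raise ValueError("immediate must be 1 or 4 bytes")
    return 1 + imm_bytes


# Assumption: the pt_regs offset needs a 32-bit displacement.
mov_len = movq_imm32_to_rsp_mem_len(4)
push32_len = push_imm_len(4)
push8_len = push_imm_len(1)

# Text sizes copied from the size(1) output quoted above.
text_before = 9863
text_after = 9671
text_saved = text_before - text_after

print("MOVQ $imm32,disp32(%rsp):", mov_len, "bytes")
print("PUSH $imm32:", push32_len, "bytes")
print("PUSH $imm8:", push8_len, "bytes")
print("text saved by the patch:", text_saved, "bytes")
```

Under these assumptions, each converted instruction shrinks by 7 to 10 bytes,
which is a small part of the total text reduction; the rest comes from the
other instructions the patch replaces.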