Re: [RFC PATCH] x86: optimize IRET returns to kernel

From: Andy Lutomirski
Date: Sat Apr 04 2015 - 12:55:09 EST


On Tue, Mar 31, 2015 at 8:59 AM, Denys Vlasenko <dvlasenk@xxxxxxxxxx> wrote:
> On 03/31/2015 03:54 PM, Andy Lutomirski wrote:
>> On Tue, Mar 31, 2015 at 5:46 AM, Denys Vlasenko <dvlasenk@xxxxxxxxxx> wrote:
>>> This is not proposed to be merged yet.
>>>
>>> Andy, this patch is in spirit of your crazy ideas of repurposing
>>> instructions for the roles they weren't intended for :)
>>>
>>> Recently I measured IRET timings and was newly "impressed"
>>> how slow it is. 200+ cycles. So I started thinking...
>>>
>>> When we return from interrupt/exception *to kernel*,
>>> most of IRET's doings are not necessary. CS and SS
>>> do not need changing. And in many (most?) cases
>>> saved RSP points right at the top of pt_regs,
>>> or (top of pt_regs+8).
>>>
>>> In which case we can (ab)use POPF and RET!
>>>
>>> Please see the patch.
>>
>> I have an old attempt at this here:
>>
>> https://git.kernel.org/cgit/linux/kernel/git/luto/linux.git/commit/?h=fast-return-to-kernel&id=6cfe29821979c42cd812878e05577f69f99fafaf
>
> Your version is better :/
>
> I'd only suggest s/pop %rsp/mov (%rsp),%rsp/
>
> I suspect "pop %rsp" is not an easy insn for CPU to digest.
>
>> If I were doing it again, I'd add a bit more care: if saved eflags
>> have RF set (can kgdb do that?), then we have to use iret.
>
> Good idea, we can even be paranoid and jump to real IRET if any
> of "unusual" flags are set.
>
>> I think that, if returning to IF=1, you need to do sti;ret to avoid an
>> infinite stack usage failure in which, during an IRQ storm, each IRQ
>> adds around one word of stack utilization because you haven't done the
>> ret yet before the next IRQ comes in. To make that robust, I'd adjust
>> the NMI code to clear IF and back up one instruction if it interrupts
>> after sti.
>
> I kinda hoped POPF is secretly a shadowing insn too.
> Experiments show it is not.
>

I'll fiddle with this some more at some point. First I want to get
rid of IST for #DB and #BP, which will reduce the number of funny
cases to think about. I hope to have patches for that ready shortly
after the next merge window closes.
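
For anyone following along, the conditions discussed in the quoted text
(return to kernel CS, saved RSP right at the end of the frame or 8 bytes
past it, no "unusual" flags) can be sketched roughly as below. This is an
illustration only, not code from either patch: struct fake_iret_frame,
can_skip_iret() and the UNUSUAL_FLAGS set are made-up names, and which
flags should count as "unusual" beyond RF is an assumption.

```c
#include <stdbool.h>
#include <stdint.h>

/* Architectural x86 EFLAGS bit values. */
#define X86_EFLAGS_TF 0x00000100ULL /* trap flag */
#define X86_EFLAGS_IF 0x00000200ULL /* interrupt enable */
#define X86_EFLAGS_NT 0x00004000ULL /* nested task */
#define X86_EFLAGS_RF 0x00010000ULL /* resume flag */
#define X86_EFLAGS_VM 0x00020000ULL /* virtual-8086 mode */

/*
 * Assumption: the set of flags treated as "unusual". Only RF is named
 * in the thread; the others are a guess at what "paranoid" might mean.
 */
#define UNUSUAL_FLAGS \
	(X86_EFLAGS_TF | X86_EFLAGS_NT | X86_EFLAGS_RF | X86_EFLAGS_VM)

/* Hypothetical stand-in for the hardware frame at the end of pt_regs. */
struct fake_iret_frame {
	uint64_t ip;
	uint64_t cs;
	uint64_t flags;
	uint64_t sp;
	uint64_t ss;
};

/*
 * Return true if a popf+ret style return looks usable instead of iret.
 * frame_end is the address just past the saved frame, i.e. where RSP
 * would be once the whole frame has been popped.
 */
static bool can_skip_iret(const struct fake_iret_frame *f, uint64_t frame_end)
{
	/* Low two bits of CS are the privilege level; nonzero means user. */
	if ((f->cs & 3) != 0)
		return false;

	/* Any "unusual" flag forces the real iret path. */
	if (f->flags & UNUSUAL_FLAGS)
		return false;

	/* Saved RSP must be right at the end of the frame, or 8 past it. */
	if (f->sp != frame_end && f->sp != frame_end + 8)
		return false;

	return true;
}
```

Everything that fails the check would fall through to the normal iret
path, so a sketch like this only ever narrows the fast path. It does not
address the sti;ret ordering or the NMI fixup mentioned above.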

--
Andy Lutomirski
AMA Capital Management, LLC