Re: [PATCH] x86: entry_32.S: change ESPFIX test to not touch PT_OLDSS(%esp)

From: Andy Lutomirski
Date: Mon Mar 09 2015 - 14:05:23 EST

On Mon, Mar 9, 2015 at 10:59 AM, Linus Torvalds
<torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
> On Mon, Mar 9, 2015 at 10:45 AM, Andy Lutomirski <luto@xxxxxxxxxxxxxx> wrote:
>> If sp0 is set to the very top of the stack, then an NMI immediately
>> after sysenter will have OLDSS off the top of the stack, and reading
>> it can crash. This is why 32-bit kernels have a (buggy!) 8 byte
>> offset in sp0.
> So I think that for sysenter, we *should* have that 8-byte buffer.
> Not in general for sp0, but for MSR_IA32_SYSENTER_ESP (which is sp1, afaik).

Bah. I'm slightly wrong here. It's not NMI immediately after
sysenter (that's handled by the emergency stack gunk, although we
might need to fix that too). It's this:

CFI_DEF_CFA esp, 0
movl TSS_sysenter_sp0(%esp),%esp
* Interrupts are disabled here, but we can't trace it until
* enough kernel state to call TRACE_IRQS_OFF can be called - but
* we immediately enable interrupts at that point anyway.

--> NMI here

pushl_cfi $__USER_DS

--> or here

/*CFI_REL_OFFSET ss, 0*/
pushl_cfi %ebp

We could fix that locally by loading sp0 - 8 atomically (w/ percpu
help) and populating the top two words with mov instead of push.

One option would be to change the NMI entry code to move itself down 8
bytes if this happens (came from kernel mode or sp == sp0 - 12,
perhaps). Or we could suck it up and keep that 8 byte offset (and fix
the cpu_tss initializer bug that threw me off in the first place).


> Just make the rule be that you can never ever have a kernel stack
> frame that doesn't contain room for ss/sp at the top.
> We have various code that looks at and touches "pt_regs" anyway, and
> accesses things out for debugging/oopsing/tracing etc. Let's not make
> the rule be that you cannot look at regs->ss without checking various
> random other fields first.
> Linus

Andy Lutomirski
AMA Capital Management, LLC
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at
Please read the FAQ at