Re: [patch V2 04/25] x86/apic: Make apic_pending_intr_clear() more robust
From: Thomas Gleixner
Date: Fri Jul 05 2019 - 16:36:44 EST
On Fri, 5 Jul 2019, Andy Lutomirski wrote:
> On Fri, Jul 5, 2019 at 8:47 AM Andrew Cooper <andrew.cooper3@xxxxxxxxxx> wrote:
> > Because TPR is 0, an incoming IPI can trigger #AC, #CP, #VC or #SX
> > without an error code on the stack, which results in a corrupt pt_regs
> > in the exception handler, and a stack underflow on the way back out,
> > most likely with a fault on IRET.
> >
> > These can be addressed by setting TPR to 0x10, which will inhibit
> > delivery of any errant IPIs in this range, but some extra sanity logic
> > may not go amiss. An error code on a 64bit stack can be spotted with
> > `testb $8, %spl` due to %rsp being aligned before pushing the exception
> > frame.
>
> Several years ago, I remember having a discussion with someone (Jan
> Beulich, maybe?) about how to efficiently make the entry code figure
> out the error code situation automatically. I suspect it was on IRC
> and I can't find the logs. I'm thinking that maybe we should just
> make Linux's idtentry code do something like this.
>
> If nothing else, we could make idtentry do:
>
> testl $8, %esp /* shorter than testb IIRC */
> jz 1f /* or jnz -- too lazy to figure it out */
> pushq $-1
> 1:
Errm, no. We should not silently paper over it. If we detect that this came
in with a wrong stack frame, i.e. not from a CPU originated exception, then
we truly should yell loud. Also in that case you want to check the APIC:ISR
and issue an EOI to clear it.
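
To make the frame check concrete, here is a minimal user-space model of what
`testb $8, %spl` tells us on entry. It assumes only what Andrew stated: in
64-bit mode the CPU aligns %rsp to 16 bytes before pushing the exception
frame. The helper names are made up for illustration and are not existing
kernel functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical model of 64-bit exception delivery: the CPU aligns the old
 * stack pointer down to 16 bytes, pushes SS, RSP, RFLAGS, CS and RIP
 * (5 * 8 bytes), and for some vectors an additional 8-byte error code.
 */
static uint64_t rsp_after_delivery(uint64_t old_rsp, bool pushes_error_code)
{
	uint64_t rsp = old_rsp & ~(uint64_t)0xf;

	rsp -= 5 * 8;
	if (pushes_error_code)
		rsp -= 8;
	return rsp;
}

/*
 * Hypothetical equivalent of `testb $8, %spl` at the entry point:
 * bit 3 clear -> 48 bytes were pushed, an error code is on the stack
 * bit 3 set   -> 40 bytes were pushed, no error code
 */
static bool frame_has_error_code(uint64_t rsp_on_entry)
{
	assert((rsp_on_entry & 7) == 0);	/* the frame is made of 8-byte slots */
	return (rsp_on_entry & 8) == 0;
}
```

An entry stub for a vector which is supposed to carry an error code
(#AC, #CP, #VC, #SX) and finds bit 3 set did not get there via a CPU
originated exception. That is the case which should yell rather than
silently push -1.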
> > Another interesting problem is an IPI with vector 0x80. A cunning
> > attacker can use this to simulate system calls from unsuspecting
> > positions in userspace, or for interrupting kernel context. At the very
> > least the int0x80 path does an unconditional swapgs, so will try to run
> > with the user gs, and I expect things will explode quickly from there.
>
> At least SMAP helps here on non-FSGSBASE systems. With FSGSBASE, I

How does it help? It still crashes the kernel.

> suppose we could harden this by adding a special check to int $0x80 to
> validate GSBASE.
> > One option here is to look at ISR and complain if it is found to be set.
>
> Barring some real hackery, we're toast long before we get far enough to
> do that.
No. We can map the APIC into the user space visible page tables for PTI
without compromising the PTI isolation and it can be read very early on
before SWAPGS. All you need is a register to clobber, nothing more. If the
ISR bit is set, then go into an error path, yell loudly, issue EOI and return.
The only issue I can see is: It's slow :)
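
For reference, a rough sketch of the ISR lookup for a given vector. This
assumes the xAPIC MMIO layout (ISR as eight 32-bit registers starting at
offset 0x100, 0x10 apart, EOI at offset 0xb0); x2APIC uses MSRs instead and
is not covered. The function names and the apic_base pointer are
hypothetical, and in the test the "APIC" is just a plain array.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define XAPIC_ISR_BASE	0x100u	/* first of eight In-Service registers */
#define XAPIC_EOI	0x0b0u	/* write 0 here to signal end of interrupt */

/* Byte offset of the ISR register which holds the bit for @vector */
static uint32_t isr_reg_offset(unsigned int vector)
{
	assert(vector < 256);
	return XAPIC_ISR_BASE + (vector / 32) * 0x10;
}

/* Bit mask for @vector inside that register */
static uint32_t isr_bit_mask(unsigned int vector)
{
	return 1u << (vector % 32);
}

/*
 * Hypothetical check: @apic_base points at the start of the 4K APIC
 * register page (in this sketch any array of 32-bit words will do).
 */
static bool vector_in_service(const volatile uint32_t *apic_base,
			      unsigned int vector)
{
	return (apic_base[isr_reg_offset(vector) / 4] & isr_bit_mask(vector)) != 0;
}
```

In the entry path this boils down to one load and one test against a
constant per vector, which is why a single clobbered register is enough.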
Thanks,
tglx