Re: [RFC] Circumventing FineIBT Via Entrypoints

From: Andrew Cooper
Date: Fri Feb 14 2025 - 18:07:02 EST


On 13/02/2025 11:24 pm, Jennifer Miller wrote:
> On Thu, Feb 13, 2025 at 09:24:18PM +0000, Andrew Cooper wrote:
>>>> ; swap stacks as normal
>>>> mov QWORD PTR gs:[rip+0x7f005f85],rsp # 0x6014 <cpu_tss_rw+20>
>>>> mov rsp,QWORD PTR gs:[rip+0x7f02c56d] # 0x2c618 <pcpu_hot+24>
>> ... these are memory accesses using the user %gs.  As you note a few
>> lines lower, %gs isn't safe at this point.
>>
>> A cunning attacker can make gs:[rip+0x7f02c56d] be a read-only mapping,
>> at which point we'll have loaded an attacker controlled %rsp, then take #PF
>> trying to spill %rsp into pcpu_hot, and now we're running the pagefault
>> handler on an attacker controlled stack and gsbase.
>>
> I don't follow, the spill of %rsp into pcpu_hot occurs first, before we
> would move to the attacker controlled stack. This is Intel asm syntax,
> sorry if that was unclear.

No, sorry.  It's clearly written; I simply wasn't paying enough attention.

> Still, I hadn't considered misusing readonly/unmapped pages on the GPR
> register spill that follows. Could we enforce that the stack pointer we get
> be page aligned to prevent this vector? So that if one were to attempt to
> point the stack to readonly or unmapped memory they should be guaranteed to
> double fault?

Hmm.

Espfix64 does involve #DF recovering from a write to a read-only stack.
(This broken corner of x86 is also fixed in FRED.  We fixed a *lot* of
things.)

As long as the #DF handler can be updated to safely distinguish espfix64
from this entrypoint attack, this seems like it might mitigate the
read-only case.
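
For concreteness, the alignment test being proposed amounts to checking
that the low bits of the candidate stack pointer are zero.  The sketch
below is illustrative userspace C, not kernel code: the helper name and
the SKETCH_PAGE_SIZE constant are hypothetical, and it assumes 4 KiB
pages.  In the real entry path this would have to be done in asm while
still register constrained.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 4 KiB base pages, so the low 12 bits are the page offset. */
#define SKETCH_PAGE_SIZE 4096ULL

/*
 * Hypothetical helper: returns true if the candidate stack pointer sits
 * exactly on a page boundary, i.e. its page offset is zero.
 */
static bool sketch_sp_page_aligned(uint64_t sp)
{
	return (sp & (SKETCH_PAGE_SIZE - 1)) == 0;
}
```

This only covers the alignment test itself.  It says nothing about how
the #DF handler would tell espfix64 apart from the attack.
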

> I think we can do the overwrite at any point before actually calling into
> the individual syscall handlers, really anywhere before potentially
> hijacked indirect control flow can occur and then restore it just after
> those return e.g., for the 64-bit path I am currently overwriting it at the
> start of do_syscall_64 and then restoring it just before
> syscall_exit_to_user_mode. I'm not sure if there is any reason to do it
> sooner while we'd still be register constrained.

I don't follow.  If any "bad" execution is found in an entrypoint, Linux
needs to panic().  Detecting the malice involves clobbering an in-use
stack, and there's no ability to safely recover.

~Andrew