Re: [RFC PATCH v2 4/4] x86/vdso: Add __vdso_sgx_enter_enclave() to wrap SGX enclave transitions

From: Sean Christopherson
Date: Fri Dec 07 2018 - 16:26:53 EST


On Fri, Dec 07, 2018 at 12:16:59PM -0800, Andy Lutomirski wrote:
>
>
> > On Dec 7, 2018, at 12:09 PM, Sean Christopherson <sean.j.christopherson@xxxxxxxxx> wrote:
> >
> >> On Fri, Dec 07, 2018 at 11:23:10AM -0800, Andy Lutomirski wrote:
> >>
> >> Ah, I see. You're saying that, if the non-enclave state is corrupted such
> >> that RIP is okay and RSP still points somewhere reasonable but the return
> >> address is garbage, then we can at least get to the fault handler and print
> >> something?
> >
> > Yep. Even for something more subtle like GPR corruption it could dump the
> > entire call stack before attempting to return back up.
> >
> >> This only works if the fault handler pointer itself is okay, though, which
> >> somewhat limits the usefulness, given that its pointer is quite likely to
> >> be on the stack very close to the return address.
> >
> > Yeah, it's not a silver bullet by any means, but it does seem useful for at
> > least some scenarios. Even exploding when invoking the handler instead of
> > at a random point might prove useful, e.g. "calling my exit handler exploded,
> > maybe my enclave corrupted the stack!".
>
> Here's another idea: calculate some little hash or other checksum of
> RSP, RBP, and perhaps a couple words on the stack, and do:

Corrupting RSP and RBP as opposed to the stack memory seems much less
likely since the enclave would have to poke into the save state area.
And as much as I dislike the practice of intentionally manipulating
SSA.RSP, preventing the user from doing something because we're "helping"
doesn't seem right.

> call __vdso_enclave_corrupted_state
>
> If you get a mismatch after return. That function could be:
>
> __vdso_enclave_corrupted_state:
> ud2
>
> And now the debug trace makes it very clear what happened.
>
> This may or may not be worth the effort.

Running a checksum on the stack for every exit doesn't seem like it'd
be worth the effort, especially since this type of bug should be quite
rare, at least in production environments.

If we want to pursue the checksum idea I think the easiest approach
would be to combine it with an exit_handler and do a simple check on
the handler. It'd be minimal overhead in the fast path and would flag
cases where invoking exit_handler() would explode, while deferring all
other checks to the user.

E.g. something like this:

diff --git a/arch/x86/entry/vdso/vsgx_enter_enclave.c b/arch/x86/entry/vdso/vsgx_enter_enclave.c
index d5145e5c5a54..c89dd3cd8da9 100644
--- a/arch/x86/entry/vdso/vsgx_enter_enclave.c
+++ b/arch/x86/entry/vdso/vsgx_enter_enclave.c
@@ -42,10 +42,13 @@ enum sgx_enclu_leaf {
 	SGX_EEXIT = 4,
 };

+#define VDSO_MAGIC 0xa5a5a5a5a5a5a5a5UL
+
 notrace long __vdso_sgx_enter_enclave(u32 op, void *tcs, void *priv,
 				     struct sgx_enclave_exit_info *exit_info,
 				     sgx_enclave_exit_handler *exit_handler)
 {
+	volatile unsigned long hash;
 	u64 rdi, rsi, rdx;
 	u32 leaf;
 	long ret;
@@ -53,6 +56,9 @@ notrace long __vdso_sgx_enter_enclave(u32 op, void *tcs, void *priv,
 	if (!tcs || !exit_info)
 		return -EINVAL;

+	/* Always hash the handler. XOR is much cheaper than Jcc. */
+	hash = (unsigned long)exit_handler ^ VDSO_MAGIC;
+
 enter_enclave:
 	if (op != SGX_EENTER && op != SGX_ERESUME)
 		return -EINVAL;
@@ -107,6 +113,8 @@ notrace long __vdso_sgx_enter_enclave(u32 op, void *tcs, void *priv,
 	 * or to return (EEXIT).
 	 */
 	if (exit_handler) {
+		if (hash != ((unsigned long)exit_handler ^ VDSO_MAGIC))
+			asm volatile("ud2\n");
 		if (exit_handler(exit_info, tcs, priv)) {
 			op = exit_info->leaf;
 			goto enter_enclave;
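
For anyone who wants to poke at the idea outside the vDSO, here is a rough
user-space sketch of the same XOR-with-magic check. It is only an
illustration: handler_hash(), handler_intact(), noop_handler() and
handler_check_demo() are made-up names, the handler signature is simplified
to a single argument, and the check returns a flag where the patch above
would hit ud2.

```c
#include <assert.h>
#include <stdint.h>

/* Same magic value as the patch; everything else here is hypothetical. */
#define VDSO_MAGIC 0xa5a5a5a5a5a5a5a5ULL

/* Simplified stand-in for sgx_enclave_exit_handler (assumption). */
typedef int (*exit_handler_t)(void *priv);

/* Fold the handler pointer into a value that is saved before entry. */
uint64_t handler_hash(exit_handler_t handler)
{
	return (uint64_t)(uintptr_t)handler ^ VDSO_MAGIC;
}

/* Returns 1 if the pointer still matches the hash saved before entry. */
int handler_intact(exit_handler_t handler, uint64_t saved_hash)
{
	return handler_hash(handler) == saved_hash;
}

/* Trivial handler used only to have a valid function pointer. */
int noop_handler(void *priv)
{
	(void)priv;
	return 0;
}

/* Returns 0 if the check behaves as expected, nonzero otherwise. */
int handler_check_demo(void)
{
	exit_handler_t handler = noop_handler;
	uint64_t saved = handler_hash(handler);

	/* Untouched pointer: the check must pass. */
	if (!handler_intact(handler, saved))
		return 1;

	/* Simulate the enclave flipping one bit of the saved pointer. */
	exit_handler_t clobbered =
		(exit_handler_t)((uintptr_t)handler ^ (uintptr_t)0x10);

	/* Clobbered pointer: the check must fail, and it is never called. */
	if (handler_intact(clobbered, saved))
		return 2;

	return 0;
}
```

Note this only detects the case where the pointer and the saved hash do not
change together; it says nothing about the rest of the stack.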

> But ISTM the enclave is almost as likely to corrupt the host state and
> then EEXIT as it is to corrupt the host state and then fault.

Agreed, I would say even more likely. But the idea is that the
exit_handler is called on any exit, not just exceptions.