Re: SGX vs LSM (Re: [PATCH v20 00/28] Intel SGX1 support)

From: Sean Christopherson
Date: Thu May 16 2019 - 20:05:40 EST

On Wed, May 15, 2019 at 11:27:04AM -0700, Andy Lutomirski wrote:
> Here's a very vague proposal that's kind of like what I've been
> thinking over the past few days. The SGX inode could track, for each
> page, a "safe-to-execute" bit. When you first open /dev/sgx/enclave,
> you get a blank enclave and all pages are safe-to-execute. When you
> do the ioctl to load context (which could be code, data, or anything
> else), the kernel will check whether the *source* VMA is executable
> and, if not, mark the page of the enclave being loaded as unsafe.
> Once the enclave is initialized, the driver will clear the
> safe-to-execute bit for any page that is successfully mapped writably.
> The intent is that a page of the enclave is safe-to-execute if that
> page was populated from executable memory and not modified since then.
> LSMs could then enforce a policy that you can map an enclave page RX
> if the page is safe-to-execute, you can map any page you want for
> write if there are no executable mappings, and you can only map a page
> for write and execute simultaneously if you have EXECMOD permission.
> This should allow an enclave to be loaded by userspace from a file
> with EXECUTE rights.

I'm still confused as to why you want to track execute permissions on the
enclave pages and add SGX-specific LSM hooks. Is there anything that
prevents userspace from building the enclave like any other DSO and then
copying it into enclave memory? I feel like I'm missing something.

1. Userspace loads enclave into regular memory, e.g. like a normal DSO.
All mmap(), mprotect(), etc... calls are subject to all existing
LSM policies.

2. Userspace opens /dev/sgx/enclave to instantiate a new enclave.

3. Userspace uses mmap() to allocate virtual memory for its enclave,
again subject to all existing LSM policies (sane userspaces map it RO
since the permissions eventually get tossed anyways).

4. SGX subsystem refuses to service page faults for enclaves that have
not yet been initialized, e.g. signals SIGBUS or SIGSEGV.

5. Userspace invokes SGX ioctl() to copy enclave from regular VMA to
enclave VMA.

6. SGX ioctl() propagates VMA protection-related flags from source VMA
to enclave VMA, e.g. invokes mprotect_fixup(). Enclave VMA(s) may
be split as part of this process.

7. At all times, mprotect() calls on the enclave VMA are subject to
existing LSM policies, i.e. it's not special cased for enclaves.
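
To make the flow above concrete, here is a rough userspace toy model of
steps 3-6. It is only a sketch: every name in it (toy_enclave,
toy_enclave_add_range(), the PROT_R/W/X bits) is made up for illustration
and is not part of the SGX driver or any kernel API. It also assumes that
pages cannot be added once the enclave is initialized. The point is just to
show the enclave range inheriting protections from the source range, and the
single enclave VMA being split where the protections differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: none of these names exist in the kernel or SGX driver. */
enum { PROT_R = 1, PROT_W = 2, PROT_X = 4 };
#define ENCL_PAGES 8

struct toy_enclave {
    int  prot[ENCL_PAGES];  /* per-page protections of the enclave mapping */
    bool initialized;       /* set once the enclave has been initialized */
};

/* Step 3: userspace maps the enclave range with throwaway permissions. */
static void toy_enclave_map(struct toy_enclave *e, int prot)
{
    for (size_t i = 0; i < ENCL_PAGES; i++)
        e->prot[i] = prot;
    e->initialized = false;
}

/*
 * Steps 5-6: "copy" a range of pages and inherit the source protections.
 * Assumption of this model: adding pages after initialization is refused.
 */
static int toy_enclave_add_range(struct toy_enclave *e, size_t first,
                                 const int *src_prot, size_t npages)
{
    if (e->initialized || first > ENCL_PAGES || npages > ENCL_PAGES - first)
        return -1;
    for (size_t i = 0; i < npages; i++)
        e->prot[first + i] = src_prot[i];
    return 0;
}

/* Step 4: faults are refused until the enclave has been initialized. */
static bool toy_enclave_fault_ok(const struct toy_enclave *e, size_t page,
                                 int access)
{
    if (!e->initialized || page >= ENCL_PAGES)
        return false;
    return (e->prot[page] & access) == access;
}

/* Count runs of equal protections, i.e. how many VMAs the range became. */
static size_t toy_enclave_vma_count(const struct toy_enclave *e)
{
    size_t n = 1;
    for (size_t i = 1; i < ENCL_PAGES; i++)
        if (e->prot[i] != e->prot[i - 1])
            n++;
    return n;
}
```

In this model, mapping the range RO, adding two RX pages at offset 0 and
three RW pages at offset 2 leaves three distinct protection runs, which is
the VMA splitting mentioned in step 6. Taking a range rather than a single
page is what lets one call cover a whole segment.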

The SGX ioctl() would need to take mmap_sem for write, but we can mitigate
that issue by changing the ioctl() to take a range of memory instead of a
single page. That'd also provide "EADD batching" that folks have