Re: [PATCH v17 18/23] platform/x86: Intel SGX driver

From: Sean Christopherson
Date: Tue Dec 18 2018 - 13:53:53 EST


On Tue, Dec 18, 2018 at 07:44:18AM -0800, Sean Christopherson wrote:
> On Mon, Dec 17, 2018 at 08:59:54PM -0800, Andy Lutomirski wrote:
> > On Mon, Dec 17, 2018 at 2:20 PM Sean Christopherson
> > <sean.j.christopherson@xxxxxxxxx> wrote:
> > >
> >
> > > My brain is still sorting out the details, but I generally like the idea
> > > of allocating an anon inode when creating an enclave, and exposing the
> > > other ioctls() via the returned fd. This is essentially the approach
> > > used by KVM to manage multiple "layers" of ioctls across KVM itself, VMs
> > and vCPUs. There are even similarities to accessing physical memory via
> > > multiple disparate domains, e.g. host kernel, host userspace and guest.
> > >
> >
> > In my mind, opening /dev/sgx would give you the requisite inode. I'm
> > not 100% sure that the chardev infrastructure allows this, but I think
> > it does.
>
> My fd/inode knowledge is lacking, to say the least. Whatever works, so
> long as we have a way to uniquely identify enclaves.

Actually, while we're dissecting the interface...

What if we re-organize the ioctls in such a way that we leave open the
possibility of allocating raw EPC for KVM via /dev/sgx? I'm not 100%
positive this approach will work[1], but conceptually it fits well with
KVM's memory model, e.g. KVM is aware of the GPA<->HVA association but
generally speaking doesn't know what's physically backing each memory
region.
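
To make that concrete, the VMM side could look something like the below.
SGX_CREATE_VIRTUAL_EPC (and its ioctl number) is the hypothetical interface
sketched later in this mail, not an existing ABI; everything else is the
existing KVM memslot API. The point being that KVM sees only an ordinary
HVA and never needs to know the region is backed by EPC.

/* Placeholder number for the proposed ioctl; purely for illustration. */
#define SGX_CREATE_VIRTUAL_EPC	_IO('s', 0x01)

#include <fcntl.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <linux/kvm.h>

static int map_epc_into_guest(int vm_fd, uint64_t gpa, uint64_t size)
{
	struct kvm_userspace_memory_region region;
	int sgx_fd, epc_fd;
	void *hva;

	sgx_fd = open("/dev/sgx", O_RDWR);
	if (sgx_fd < 0)
		return -1;

	/* Returns a new fd backed by raw EPC, a la SGX_CREATE_ENCLAVE. */
	epc_fd = ioctl(sgx_fd, SGX_CREATE_VIRTUAL_EPC, size);
	if (epc_fd < 0)
		return -1;

	hva = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, epc_fd, 0);
	if (hva == MAP_FAILED)
		return -1;

	/* KVM sees a plain GPA<->HVA mapping; /dev/sgx owns the backing. */
	region.slot = 0;
	region.flags = 0;
	region.guest_phys_addr = gpa;
	region.memory_size = size;
	region.userspace_addr = (uint64_t)hva;
	return ioctl(vm_fd, KVM_SET_USER_MEMORY_REGION, &region);
}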

Tangentially related, I think we should support allocating multiple
enclaves from a single /dev/sgx fd, i.e. a process shouldn't have to
open /dev/sgx every time it wants to create a new enclave.

Something like this:

/dev/sgx
  |
  -> mmap() { return -EINVAL; }
  |
  -> unlocked_ioctl()
     |
     -> SGX_CREATE_ENCLAVE: { return alloc_enclave_fd(); }
     |    |
     |    -> mmap() { ... }
     |    |
     |    -> get_unmapped_area() {
     |    |        if (enclave->size) {
     |    |                if (!addr)
     |    |                        addr = enclave->base;
     |    |                if (addr + len + (pgoff << PAGE_SHIFT) >
     |    |                    enclave->base + enclave->size)
     |    |                        return -EINVAL;
     |    |        } else {
     |    |                if (!validate_size(len))
     |    |                        return -EINVAL;
     |    |                addr = naturally_align(len);
     |    |        }
     |    |        return addr;
     |    |  }
     |    |
     |    -> unlocked_ioctl() {
     |           SGX_ENCLAVE_ADD_PAGE: { ... }
     |           SGX_ENCLAVE_INIT: { ... }
     |           SGX_ENCLAVE_REMOVE_PAGES: { ... }
     |           SGX_ENCLAVE_MODIFY_PAGES: { ... }
     |       }
     |
     -> SGX_CREATE_VIRTUAL_EPC: { return alloc_epc_fd(); }
          |
          -> mmap() { ... }
          |
          -> get_unmapped_area() { <page aligned/sized> }
          |
          -> unlocked_ioctl() {
                 SGX_VIRTUAL_EPC_???:
                 SGX_VIRTUAL_EPC_???:
             }
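
And to sketch the top of that tree in code, tying back to Andy's anon inode
suggestion: SGX_CREATE_ENCLAVE would hand back a new anon inode fd whose
file_operations provide the per-enclave mmap() and ioctls, same as KVM's
VM/vCPU fds. All names below (struct sgx_encl, sgx_encl_fops, the ioctl
numbers) are placeholders for illustration, not a real implementation.

#include <linux/anon_inodes.h>
#include <linux/fs.h>
#include <linux/slab.h>

/* Placeholder ioctl numbers purely for illustration. */
#define SGX_CREATE_ENCLAVE	_IO('s', 0x00)
#define SGX_CREATE_VIRTUAL_EPC	_IO('s', 0x01)

/* Placeholder enclave context; the real struct would track base/size/etc. */
struct sgx_encl {
	unsigned long base;
	unsigned long size;
};

/* Would implement the per-enclave mmap(), get_unmapped_area() and ioctls. */
static const struct file_operations sgx_encl_fops;

static long sgx_create_enclave(void)
{
	struct sgx_encl *encl;

	encl = kzalloc(sizeof(*encl), GFP_KERNEL);
	if (!encl)
		return -ENOMEM;

	/* Each enclave gets its own fd, so one /dev/sgx fd can create many. */
	return anon_inode_getfd("sgx-enclave", &sgx_encl_fops, encl,
				O_RDWR | O_CLOEXEC);
}

static long sgx_dev_ioctl(struct file *filp, unsigned int cmd,
			  unsigned long arg)
{
	switch (cmd) {
	case SGX_CREATE_ENCLAVE:
		return sgx_create_enclave();
	case SGX_CREATE_VIRTUAL_EPC:
		return -EOPNOTSUPP;	/* raw EPC for KVM, see [1] */
	default:
		return -ENOIOCTLCMD;
	}
}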


[1] Delegating EPC management to /dev/sgx is viable for virtualizing SGX
without oversubscribing EPC to guests, but oversubscribing EPC in a
VMM requires handling EPC-related VM-Exits and using instructions
that will #UD if the CPU is not post-VMXON. I *think* having KVM
forward VM-Exits to x86/sgx would work, but it's entirely possible
it'd be a complete cluster.