Re: [PATCH v26 10/22] x86/sgx: Linux Enclave Driver

From: Andy Lutomirski
Date: Thu Feb 20 2020 - 13:51:56 EST


On Thu, Feb 20, 2020 at 10:13 AM Sean Christopherson
<sean.j.christopherson@xxxxxxxxx> wrote:
>
> On Tue, Feb 18, 2020 at 07:26:31PM -0800, Jordan Hand wrote:
> > I ran our validation tests for the Open Enclave SDK against this patch
> > set and came across a potential issue.
> >
> > On 2/9/20 1:25 PM, Jarkko Sakkinen wrote:
> > > +/**
> > > + * sgx_encl_may_map() - Check if a requested VMA mapping is allowed
> > > + * @encl: an enclave
> > > + * @start: lower bound of the address range, inclusive
> > > + * @end: upper bound of the address range, exclusive
> > > + * @vm_prot_bits: requested protections of the address range
> > > + *
> > > + * Iterate through the enclave pages contained within [@start, @end) to verify
> > > + * the permissions requested by @vm_prot_bits do not exceed that of any enclave
> > > + * page to be mapped. Page addresses that do not have an associated enclave
> > > + * page are interpreted to zero permissions.
> > > + *
> > > + * Return:
> > > + * 0 on success,
> > > + * -EACCES if VMA permissions exceed enclave page permissions
> > > + */
> > > +int sgx_encl_may_map(struct sgx_encl *encl, unsigned long start,
> > > + unsigned long end, unsigned long vm_prot_bits)
> > > +{
> > > + unsigned long idx, idx_start, idx_end;
> > > + struct sgx_encl_page *page;
> > > +
> > > + /* PROT_NONE always succeeds. */
> > > + if (!vm_prot_bits)
> > > + return 0;
> > > +
> > > + idx_start = PFN_DOWN(start);
> > > + idx_end = PFN_DOWN(end - 1);
> > > +
> > > + for (idx = idx_start; idx <= idx_end; ++idx) {
> > > + mutex_lock(&encl->lock);
> > > + page = radix_tree_lookup(&encl->page_tree, idx);
> > > + mutex_unlock(&encl->lock);
> > > +
> > > + if (!page || (~page->vm_max_prot_bits & vm_prot_bits))
> > > + return -EACCES;
> > > + }
> > > +
> > > + return 0;
> > > +}
> > > +static struct sgx_encl_page *sgx_encl_page_alloc(struct sgx_encl *encl,
> > > + unsigned long offset,
> > > + u64 secinfo_flags)
> > > +{
> > > + struct sgx_encl_page *encl_page;
> > > + unsigned long prot;
> > > +
> > > + encl_page = kzalloc(sizeof(*encl_page), GFP_KERNEL);
> > > + if (!encl_page)
> > > + return ERR_PTR(-ENOMEM);
> > > +
> > > + encl_page->desc = encl->base + offset;
> > > + encl_page->encl = encl;
> > > +
> > > + prot = _calc_vm_trans(secinfo_flags, SGX_SECINFO_R, PROT_READ) |
> > > + _calc_vm_trans(secinfo_flags, SGX_SECINFO_W, PROT_WRITE) |
> > > + _calc_vm_trans(secinfo_flags, SGX_SECINFO_X, PROT_EXEC);
> > > +
> > > + /*
> > > + * TCS pages must always RW set for CPU access while the SECINFO
> > > + * permissions are *always* zero - the CPU ignores the user provided
> > > + * values and silently overwrites them with zero permissions.
> > > + */
> > > + if ((secinfo_flags & SGX_SECINFO_PAGE_TYPE_MASK) == SGX_SECINFO_TCS)
> > > + prot |= PROT_READ | PROT_WRITE;
> > > +
> > > + /* Calculate maximum of the VM flags for the page. */
> > > + encl_page->vm_max_prot_bits = calc_vm_prot_bits(prot, 0);
> >
> > During mprotect (in mm/mprotect.c line 525) the following checks if
> > READ_IMPLIES_EXECUTE and a PROT_READ is being requested. If so and
> > VM_MAYEXEC is set, it also adds PROT_EXEC to the request.
> >
> > if (rier && (vma->vm_flags & VM_MAYEXEC))
> > prot |= PROT_EXEC;
> >
> > But if we look at sgx_encl_page_alloc(), we see vm_max_prot_bits is set
> > without taking VM_MAYEXEC into account:
> >
> > encl_page->vm_max_prot_bits = calc_vm_prot_bits(prot, 0);
> >
> > sgx_encl_may_map() checks that the requested protection can be added with:
> >
> > if (!page || (~page->vm_max_prot_bits & vm_prot_bits))
> > return -EACCESS
> >
> > This means that for any process where READ_IMPLIES_EXECUTE is set and
> > page where (vma->vm_flags & VM_MAYEXEC) == true, mmap/mprotect calls to
> > that request PROT_READ on a page that was not added with PROT_EXEC will
> > fail.
>
> I could've sworn this was discussed on the SGX list at one point, but
> apparently we only discussed it internally. Anyways...
>
> More than likely, the READ_IMPLIES_EXECUTE (RIE) crud rears its head
> because part of the enclave loader is written in assembly. Unless
> explicitly told otherwise, the linker assumes that any program with
> assembly code may need an executable stack, which leads to the RIE
> personality being set for the process. Here's a fantastic write up for
> more details: https://www.airs.com/blog/archives/518
>
> There are essentially two paths we can take:
>
> 1) Exempt EPC pages from RIE during mmap()/mprotect(), i.e. don't add
> PROT_EXEC for enclaves.

Seems reasonable.

Honestly, it probably makes sense to try to exempt almost everything
from RIE. I'd be a bit surprised if RIE is actually useful for
anything other than plain anonymous pages and private file mappings.