Re: [PATCH v23 12/24] x86/sgx: Linux Enclave Driver

From: Sean Christopherson
Date: Wed Oct 30 2019 - 05:30:48 EST


On Tue, Oct 29, 2019 at 11:29:20AM +0200, Jarkko Sakkinen wrote:
> On Mon, Oct 28, 2019 at 11:03:12PM +0200, Jarkko Sakkinen wrote:
> > +/**
> > + * sgx_ioc_enclave_add_pages() - The handler for %SGX_IOC_ENCLAVE_ADD_PAGES
> > + * @encl: pointer to an enclave instance (via ioctl() file pointer)
> > + * @arg: a user pointer to a struct sgx_enclave_add_pages instance
> > + *
> > + * Add (EADD) one or more pages to an uninitialized enclave, and optionally
> > + * extend (EEXTEND) the measurement with the contents of the page. The range of
> > + * pages must be virtually contiguous. The SECINFO and measurement mask are
> > + * applied to all pages, i.e. pages with different properties must be added in
> > + * separate calls.
> > + *
> > + * A SECINFO for a TCS is required to always contain zero permissions because
> > + * the CPU silently zeros them. Allowing anything else would cause a mismatch in
> > + * the measurement.
> > + *
> > + * mmap()'s protection bits are capped by the page permissions. For each page
> > + * address, the maximum protection bits are computed with the following
> > + * heuristics:
> > + *
> > + * 1. A regular page: PROT_R, PROT_W and PROT_X match the SECINFO permissions.
> > + * 2. A TCS page: PROT_R | PROT_W.
> > + * 3. No page: PROT_NONE.
> > + *
> > + * mmap() is not allowed to surpass the minimum of the maximum protection bits
> > + * within the given address range.
> > + *
> > + * As stated above, a non-existent page is interpreted as a page with no
> > + * permissions. In effect, this allows mmap() with PROT_NONE to be used to seek
> > + * an address range for the enclave that can then be populated into SECS.
> > + *
> > + * @arg->addr, @arg->src and @arg->length are adjusted to reflect the
> > + * remaining pages that need to be added to the enclave, e.g. userspace can
> > + * re-invoke SGX_IOC_ENCLAVE_ADD_PAGES using the same struct in response to an
> > + * ERESTARTSYS error.
> > + *
> > + * Return:
> > + * 0 on success,
> > + * -EINVAL if any input param or the SECINFO contains invalid data,
> > + * -EACCES if an executable source page is located in a noexec partition,
> > + * -ENOMEM if any memory allocation, including EPC, fails,
> > + * -ERESTARTSYS if a pending signal is recognized
> > + */
> > +static long sgx_ioc_enclave_add_pages(struct sgx_encl *encl, void __user *arg)
>
> This should return the number of pages processed instead of zero on
> success. The kernel needs to be able to cap the number of pages it will
> process.

Why? The number of pages processed is effectively returned via the params
on any error, so wouldn't it be more appropriate to return -ERESTARTSYS?
And I don't see any reason to add an arbitrary cap on the number of pages:
SGX plays nice with the scheduler and signals, and restricting the
number of EPC pages available to a process via cgroups (returning -ENOMEM)
is a better solution for managing EPC.
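
To make the "returned via the params" point concrete, here is a minimal
sketch of the re-invoke pattern the kernel-doc describes. It is not driver
code: struct add_pages_arg, mock_add_pages() and add_all_pages() are
hypothetical names, the struct only mirrors the addr/src/length fields named
in the comment, and the mock returns -EINTR as a stand-in for the
interrupted case. The idea is that the handler advances the struct in place,
so the caller can just loop on the same struct without a page count in the
return value.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define PAGE_SIZE_BYTES 4096u

/* Hypothetical mirror of the ioctl argument (addr, src, length only). */
struct add_pages_arg {
	uint64_t addr;		/* next enclave address to populate */
	uint64_t src;		/* next source address to copy from */
	uint64_t length;	/* bytes still to be added */
};

/*
 * Mock handler, an assumption for illustration: adds at most @budget pages,
 * then reports an interruption. The struct is advanced page by page so it
 * always describes the remaining work.
 */
static int mock_add_pages(struct add_pages_arg *arg, unsigned int budget)
{
	while (arg->length) {
		if (budget == 0)
			return -EINTR;	/* stand-in for a pending signal */

		arg->addr += PAGE_SIZE_BYTES;
		arg->src += PAGE_SIZE_BYTES;
		arg->length -= PAGE_SIZE_BYTES;
		budget--;
	}

	return 0;
}

/*
 * Caller side: re-invoke with the same struct until the handler stops
 * reporting an interruption. Returns the number of calls made on success,
 * or the negative error from the handler.
 */
static int add_all_pages(struct add_pages_arg *arg, unsigned int budget)
{
	int calls = 0;
	int ret;

	assert(budget > 0);	/* a zero budget would never make progress */

	do {
		ret = mock_add_pages(arg, budget);
		calls++;
	} while (ret == -EINTR);

	return ret ? ret : calls;
}
```

With 10 pages to add and a mock budget of 4 pages per call, the loop makes
three calls and ends with length == 0. The caller never needs a page count
from the return value because the struct already records the progress.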