Re: [PATCH V3 14/30] x86/sgx: Support restricting of enclave page permissions

From: Jarkko Sakkinen
Date: Tue Apr 05 2022 - 18:55:20 EST


On Tue, 2022-04-05 at 17:19 +0300, Jarkko Sakkinen wrote:
> On Tue, 2022-04-05 at 16:40 +0300, Jarkko Sakkinen wrote:
> > On Tue, 2022-04-05 at 08:07 +0300, Jarkko Sakkinen wrote:
> > > On Tue, 2022-04-05 at 08:03 +0300, Jarkko Sakkinen wrote:
> > > > On Mon, 2022-04-04 at 09:49 -0700, Reinette Chatre wrote:
> > > > > In the initial (SGX1) version of SGX, pages in an enclave need to be
> > > > > created with permissions that support all usages of the pages, from the
> > > > > time the enclave is initialized until it is unloaded. For example,
> > > > > pages used by a JIT compiler or when code needs to otherwise be
> > > > > relocated need to always have RWX permissions.
> > > > >
> > > > > SGX2 includes a new function ENCLS[EMODPR] that is run from the kernel
> > > > > and can be used to restrict the EPCM permissions of regular enclave
> > > > > pages within an initialized enclave.
> > > > >
> > > > > Introduce ioctl() SGX_IOC_ENCLAVE_RESTRICT_PERMISSIONS to support
> > > > > restricting EPCM permissions. With this ioctl() the user specifies
> > > > > a page range and the EPCM permissions to be applied to all pages in
> > > > > the provided range. ENCLS[EMODPR] is run to restrict the EPCM
> > > > > permissions followed by the ENCLS[ETRACK] flow that will ensure
> > > > > no cached linear-to-physical address mappings to the changed
> > > > > pages remain.
> > > > >
> > > > > It is possible for the permission change request to fail on any
> > > > > page within the provided range, either with an error encountered
> > > > > by the kernel or by the SGX hardware while running
> > > > > ENCLS[EMODPR]. To support partial success the ioctl() returns an
> > > > > error code based on failures encountered by the kernel as well
> > > > > as two result output parameters: one for the number of pages
> > > > > that were successfully changed and one for the SGX return code.
> > > > >
> > > > > The page table entry permissions are not impacted by the EPCM
> > > > > permission changes. VMAs and PTEs will continue to allow the
> > > > > maximum vetted permissions determined at the time the pages
> > > > > are added to the enclave. The SGX error code in a page fault
> > > > > will indicate if it was an EPCM permission check that prevented
> > > > > an access attempt.
> > > > >
> > > > > No checking is done to ensure that the permissions are actually
> > > > > being restricted. This is because the enclave may have relaxed
> > > > > the EPCM permissions from within the enclave without letting the
> > > > > kernel know. An attempt to relax permissions using this call will
> > > > > be ignored by the hardware.
> > > > >
> > > > > Signed-off-by: Reinette Chatre <reinette.chatre@xxxxxxxxx>
> > > > > ---
> > > > > Changes since V2:
> > > > > - Include the sgx_ioc_sgx2_ready() utility
> > > > >   that previously was in "x86/sgx: Support relaxing of enclave page
> > > > >   permissions" that is removed from the next version.
> > > > > - Few renames requested by Jarkko:
> > > > >   struct sgx_enclave_restrict_perm ->
> > > > >          struct sgx_enclave_restrict_permissions
> > > > >   sgx_enclave_restrict_perm()     ->
> > > > >          sgx_enclave_restrict_permissions()
> > > > >   sgx_ioc_enclave_restrict_perm() ->
> > > > >          sgx_ioc_enclave_restrict_permissions()
> > > > > - Make EPCM permissions independent from kernel view of
> > > > >   permissions.  (Jarkko)
> > > > >   - Remove attempt at runtime tracking of EPCM permissions
> > > > >     (sgx_encl_page->vm_run_prot_bits).
> > > > >   - Do not flush page table entries - they are no longer impacted by
> > > > >     EPCM permission changes.
> > > > >   - Modify changelog to reflect new architecture.
> > > > > - Ensure at least PROT_READ is requested - enclave requires read
> > > > >   access to the page for commands like EMODPE and EACCEPT. (Jarkko)
> > > > >
> > > > > Changes since V1:
> > > > > - Change terminology to use "relax" instead of "extend" to refer to
> > > > >   the case when enclave page permissions are added (Dave).
> > > > > - Use ioctl() in commit message (Dave).
> > > > > - Add examples on what permissions would be allowed (Dave).
> > > > > - Split enclave page permission changes into two ioctl()s, one for
> > > > >   permission restricting (SGX_IOC_ENCLAVE_RESTRICT_PERMISSIONS)
> > > > >   and one for permission relaxing (SGX_IOC_ENCLAVE_RELAX_PERMISSIONS)
> > > > >   (Jarkko).
> > > > > - In support of the ioctl() name change the following names have been
> > > > >   changed:
> > > > >   struct sgx_page_modp -> struct sgx_enclave_restrict_perm
> > > > >   sgx_ioc_page_modp() -> sgx_ioc_enclave_restrict_perm()
> > > > >   sgx_page_modp() -> sgx_enclave_restrict_perm()
> > > > > - ioctl() takes entire secinfo as input instead of
> > > > >   page permissions only (Jarkko).
> > > > > - Fix kernel-doc to include () in function name.
> > > > > - Create and use utility for the ETRACK flow.
> > > > > - Fixups in comments
> > > > > - Move kernel-doc to function that provides documentation for
> > > > >   Documentation/x86/sgx.rst.
> > > > > - Remove redundant comment.
> > > > > - Make explicit which members of struct sgx_enclave_restrict_perm
> > > > >   are for output (Dave).
> > > > >
> > > > >  arch/x86/include/uapi/asm/sgx.h |  21 +++
> > > > >  arch/x86/kernel/cpu/sgx/ioctl.c | 242 ++++++++++++++++++++++++++++++++
> > > > >  2 files changed, 263 insertions(+)
> > > > >
> > > > > diff --git a/arch/x86/include/uapi/asm/sgx.h b/arch/x86/include/uapi/asm/sgx.h
> > > > > index f4b81587e90b..a0a24e94fb27 100644
> > > > > --- a/arch/x86/include/uapi/asm/sgx.h
> > > > > +++ b/arch/x86/include/uapi/asm/sgx.h
> > > > > @@ -29,6 +29,8 @@ enum sgx_page_flags {
> > > > >         _IOW(SGX_MAGIC, 0x03, struct sgx_enclave_provision)
> > > > >  #define SGX_IOC_VEPC_REMOVE_ALL \
> > > > >         _IO(SGX_MAGIC, 0x04)
> > > > > +#define SGX_IOC_ENCLAVE_RESTRICT_PERMISSIONS \
> > > > > +       _IOWR(SGX_MAGIC, 0x05, struct sgx_enclave_restrict_permissions)
> > > > >  
> > > > >  /**
> > > > >   * struct sgx_enclave_create - parameter structure for the
> > > > > @@ -76,6 +78,25 @@ struct sgx_enclave_provision {
> > > > >         __u64 fd;
> > > > >  };
> > > > >  
> > > > > +/**
> > > > > + * struct sgx_enclave_restrict_permissions - parameters for ioctl
> > > > > + *                                        %SGX_IOC_ENCLAVE_RESTRICT_PERMISSIONS
> > > > > + * @offset:    starting page offset (page aligned relative to enclave base
> > > > > + *             address defined in SECS)
> > > > > + * @length:    length of memory (multiple of the page size)
> > > > > + * @secinfo:   address for the SECINFO data containing the new permission bits
> > > > > + *             for pages in range described by @offset and @length
> > > > > + * @result:    (output) SGX result code of ENCLS[EMODPR] function
> > > > > + * @count:     (output) bytes successfully changed (multiple of page size)
> > > > > + */
> > > > > +struct sgx_enclave_restrict_permissions {
> > > > > +       __u64 offset;
> > > > > +       __u64 length;
> > > > > +       __u64 secinfo;
> > > > > +       __u64 result;
> > > > > +       __u64 count;
> > > > > +};
> > > > > +
> > > > >  struct sgx_enclave_run;
> > > > >  
> > > > >  /**
> > > > > diff --git a/arch/x86/kernel/cpu/sgx/ioctl.c b/arch/x86/kernel/cpu/sgx/ioctl.c
> > > > > index 0460fd224a05..4d88bfd163e7 100644
> > > > > --- a/arch/x86/kernel/cpu/sgx/ioctl.c
> > > > > +++ b/arch/x86/kernel/cpu/sgx/ioctl.c
> > > > > @@ -660,6 +660,244 @@ static long sgx_ioc_enclave_provision(struct sgx_encl *encl, void __user *arg)
> > > > >         return sgx_set_attribute(&encl->attributes_mask, params.fd);
> > > > >  }
> > > > >  
> > > > > +/*
> > > > > + * Ensure enclave is ready for SGX2 functions. Readiness is checked
> > > > > + * by ensuring the hardware supports SGX2 and the enclave is initialized
> > > > > + * and thus able to handle requests to modify pages within it.
> > > > > + */
> > > > > +static int sgx_ioc_sgx2_ready(struct sgx_encl *encl)
> > > > > +{
> > > > > +       if (!(cpu_feature_enabled(X86_FEATURE_SGX2)))
> > > > > +               return -ENODEV;
> > > > > +
> > > > > +       if (!test_bit(SGX_ENCL_INITIALIZED, &encl->flags))
> > > > > +               return -EINVAL;
> > > > > +
> > > > > +       return 0;
> > > > > +}
> > > > > +
> > > > > +/*
> > > > > + * Return valid permission fields from a secinfo structure provided by
> > > > > + * user space. The secinfo structure is required to only have bits in
> > > > > + * the permission fields set.
> > > > > + */
> > > > > +static int sgx_perm_from_user_secinfo(void __user *_secinfo, u64 *secinfo_perm)
> > > > > +{
> > > > > +       struct sgx_secinfo secinfo;
> > > > > +       u64 perm;
> > > > > +
> > > > > +       if (copy_from_user(&secinfo, (void __user *)_secinfo,
> > > > > +                          sizeof(secinfo)))
> > > > > +               return -EFAULT;
> > > > > +
> > > > > +       if (secinfo.flags & ~SGX_SECINFO_PERMISSION_MASK)
> > > > > +               return -EINVAL;
> > > > > +
> > > > > +       if (memchr_inv(secinfo.reserved, 0, sizeof(secinfo.reserved)))
> > > > > +               return -EINVAL;
> > > > > +
> > > > > +       perm = secinfo.flags & SGX_SECINFO_PERMISSION_MASK;
> > > > > +
> > > > > +       /*
> > > > > +        * Read access is required for the enclave to be able to use the page.
> > > > > +        * SGX instructions like ENCLU[EMODPE] and ENCLU[EACCEPT] require
> > > > > +        * read access.
> > > > > +        */
> > > > > +       if (!(perm & SGX_SECINFO_R))
> > > > > +               return -EINVAL;
> > > > > +
> > > > > +       *secinfo_perm = perm;
> > > > > +
> > > > > +       return 0;
> > > > > +}
> > > > > +
> > > > > +/*
> > > > > + * Some SGX functions require that no cached linear-to-physical address
> > > > > + * mappings are present before they can succeed. Collaborate with
> > > > > + * hardware via ENCLS[ETRACK] to ensure that all cached
> > > > > + * linear-to-physical address mappings belonging to all threads of
> > > > > + * the enclave are cleared. See sgx_encl_cpumask() for details.
> > > > > + */
> > > > > +static int sgx_enclave_etrack(struct sgx_encl *encl)
> > > > > +{
> > > > > +       void *epc_virt;
> > > > > +       int ret;
> > > > > +
> > > > > +       epc_virt = sgx_get_epc_virt_addr(encl->secs.epc_page);
> > > > > +       ret = __etrack(epc_virt);
> > > > > +       if (ret) {
> > > > > +               /*
> > > > > +                * ETRACK only fails when there is an OS issue. For
> > > > > +                * example, two consecutive ETRACK was sent without
> > > > > +                * completed IPI between.
> > > > > +                */
> > > > > +               pr_err_once("ETRACK returned %d (0x%x)", ret, ret);
> > > > > +               /*
> > > > > +                * Send IPIs to kick CPUs out of the enclave and
> > > > > +                * try ETRACK again.
> > > > > +                */
> > > > > +               on_each_cpu_mask(sgx_encl_cpumask(encl), sgx_ipi_cb, NULL, 1);
> > > > > +               ret = __etrack(epc_virt);
> > > > > +               if (ret) {
> > > > > +                       pr_err_once("ETRACK repeat returned %d (0x%x)",
> > > > > +                                   ret, ret);
> > > > > +                       return -EFAULT;
> > > > > +               }
> > > > > +       }
> > > > > +       on_each_cpu_mask(sgx_encl_cpumask(encl), sgx_ipi_cb, NULL, 1);
> > > > > +
> > > > > +       return 0;
> > > > > +}
> > > > > +
> > > > > +/**
> > > > > + * sgx_enclave_restrict_permissions() - Restrict EPCM permissions
> > > > > + * @encl:      Enclave to which the pages belong.
> > > > > + * @modp:      Checked parameters from user on which pages need modifying.
> > > > > + * @secinfo_perm: New (validated) permission bits.
> > > > > + *
> > > > > + * Return:
> > > > > + * - 0:                Success.
> > > > > + * - -errno:   Otherwise.
> > > > > + */
> > > > > +static long
> > > > > +sgx_enclave_restrict_permissions(struct sgx_encl *encl,
> > > > > +                                struct sgx_enclave_restrict_permissions *modp,
> > > > > +                                u64 secinfo_perm)
> > > > > +{
> > > > > +       struct sgx_encl_page *entry;
> > > > > +       struct sgx_secinfo secinfo;
> > > > > +       unsigned long addr;
> > > > > +       unsigned long c;
> > > > > +       void *epc_virt;
> > > > > +       int ret;
> > > > > +
> > > > > +       memset(&secinfo, 0, sizeof(secinfo));
> > > > > +       secinfo.flags = secinfo_perm;
> > > > > +
> > > > > +       for (c = 0 ; c < modp->length; c += PAGE_SIZE) {
> > > > > +               addr = encl->base + modp->offset + c;
> > > > > +
> > > > > +               mutex_lock(&encl->lock);
> > > > > +
> > > > > +               entry = sgx_encl_load_page(encl, addr);
> > > > > +               if (IS_ERR(entry)) {
> > > > > +                       ret = PTR_ERR(entry) == -EBUSY ? -EAGAIN : -EFAULT;
> > > > > +                       goto out_unlock;
> > > > > +               }
> > > > > +
> > > > > +               /*
> > > > > +                * Changing EPCM permissions is only supported on regular
> > > > > +                * SGX pages. Attempting this change on other pages will
> > > > > +                * result in #PF.
> > > > > +                */
> > > > > +               if (entry->type != SGX_PAGE_TYPE_REG) {
> > > > > +                       ret = -EINVAL;
> > > > > +                       goto out_unlock;
> > > > > +               }
> > > > > +
> > > > > +               /*
> > > > > +                * Do not verify the permission bits requested. Kernel
> > > > > +                * has no control over how EPCM permissions can be relaxed
> > > > > +                * from within the enclave. ENCLS[EMODPR] can only
> > > > > +                * remove existing EPCM permissions, attempting to set
> > > > > +                * new permissions will be ignored by the hardware.
> > > > > +                */
> > > > > +
> > > > > +               /* Change EPCM permissions. */
> > > > > +               epc_virt = sgx_get_epc_virt_addr(entry->epc_page);
> > > > > +               ret = __emodpr(&secinfo, epc_virt);
> > > > > +               if (encls_faulted(ret)) {
> > > > > +                       /*
> > > > > +                        * All possible faults should be avoidable:
> > > > > +                        * parameters have been checked, will only change
> > > > > +                        * permissions of a regular page, and no concurrent
> > > > > +                        * SGX1/SGX2 ENCLS instructions since these
> > > > > +                        * are protected with mutex.
> > > > > +                        */
> > > > > +                       pr_err_once("EMODPR encountered exception %d\n",
> > > > > +                                   ENCLS_TRAPNR(ret));
> > > > > +                       ret = -EFAULT;
> > > > > +                       goto out_unlock;
> > > > > +               }
> > > > > +               if (encls_failed(ret)) {
> > > > > +                       modp->result = ret;
> > > > > +                       ret = -EFAULT;
> > > > > +                       goto out_unlock;
> > > > > +               }
> > > > > +
> > > > > +               ret = sgx_enclave_etrack(encl);
> > > > > +               if (ret) {
> > > > > +                       ret = -EFAULT;
> > > > > +                       goto out_unlock;
> > > > > +               }
> > > > > +
> > > > > +               mutex_unlock(&encl->lock);
> > > > > +       }
> > > > > +
> > > > > +       ret = 0;
> > > > > +       goto out;
> > > > > +
> > > > > +out_unlock:
> > > > > +       mutex_unlock(&encl->lock);
> > > > > +out:
> > > > > +       modp->count = c;
> > > > > +
> > > > > +       return ret;
> > > > > +}
> > > > > +
> > > > > +/**
> > > > > + * sgx_ioc_enclave_restrict_permissions() - handler for
> > > > > + *                                        %SGX_IOC_ENCLAVE_RESTRICT_PERMISSIONS
> > > > > + * @encl:      an enclave pointer
> > > > > + * @arg:       userspace pointer to a &struct sgx_enclave_restrict_permissions
> > > > > + *             instance
> > > > > + *
> > > > > + * SGX2 distinguishes between relaxing and restricting the enclave page
> > > > > + * permissions maintained by the hardware (EPCM permissions) of pages
> > > > > + * belonging to an initialized enclave (after SGX_IOC_ENCLAVE_INIT).
> > > > > + *
> > > > > + * EPCM permissions cannot be restricted from within the enclave, the enclave
> > > > > + * requires the kernel to run the privileged level 0 instructions ENCLS[EMODPR]
> > > > > + * and ENCLS[ETRACK]. An attempt to relax EPCM permissions with this call
> > > > > + * will be ignored by the hardware.
> > > > > + *
> > > > > + * Return:
> > > > > + * - 0:                Success
> > > > > + * - -errno:   Otherwise
> > > > > + */
> > > > > +static long sgx_ioc_enclave_restrict_permissions(struct sgx_encl *encl,
> > > > > +                                                void __user *arg)
> > > > > +{
> > > > > +       struct sgx_enclave_restrict_permissions params;
> > > > > +       u64 secinfo_perm;
> > > > > +       long ret;
> > > > > +
> > > > > +       ret = sgx_ioc_sgx2_ready(encl);
> > > > > +       if (ret)
> > > > > +               return ret;
> > > > > +
> > > > > +       if (copy_from_user(&params, arg, sizeof(params)))
> > > > > +               return -EFAULT;
> > > > > +
> > > > > +       if (sgx_validate_offset_length(encl, params.offset, params.length))
> > > > > +               return -EINVAL;
> > > > > +
> > > > > +       ret = sgx_perm_from_user_secinfo((void __user *)params.secinfo,
> > > > > +                                        &secinfo_perm);
> > > > > +       if (ret)
> > > > > +               return ret;
> > > > > +
> > > > > +       if (params.result || params.count)
> > > > > +               return -EINVAL;
> > > > > +
> > > > > +       ret = sgx_enclave_restrict_permissions(encl, &params, secinfo_perm);
> > > > > +
> > > > > +       if (copy_to_user(arg, &params, sizeof(params)))
> > > > > +               return -EFAULT;
> > > > > +
> > > > > +       return ret;
> > > > > +}
> > > > > +
> > > > >  long sgx_ioctl(struct file *filep, unsigned int cmd, unsigned long arg)
> > > > >  {
> > > > >         struct sgx_encl *encl = filep->private_data;
> > > > > @@ -681,6 +919,10 @@ long sgx_ioctl(struct file *filep, unsigned int cmd, unsigned long arg)
> > > > >         case SGX_IOC_ENCLAVE_PROVISION:
> > > > >                 ret = sgx_ioc_enclave_provision(encl, (void __user *)arg);
> > > > >                 break;
> > > > > +       case SGX_IOC_ENCLAVE_RESTRICT_PERMISSIONS:
> > > > > +               ret = sgx_ioc_enclave_restrict_permissions(encl,
> > > > > +                                                          (void __user *)arg);
> > > > > +               break;
> > > > >         default:
> > > > >                 ret = -ENOIOCTLCMD;
> > > > >                 break;
> > > >
> > > > I think this a big improvement all things considered. I just put 
> > > > a kernel building and see if I get this wired to our code:
> > > >
> > > > https://github.com/jarkkojs/aur-linux-sgx/actions/runs/2094084943
> > > >
> > > > I'll report my findings later on.
> > >
> > > I pulled the patches from sgx2_submitted_v3_plus_rwx branch. Just
> > > sanity checking that it is v3, correct?
> >
> > I'm getting EINVAL with SECINFO that I think is legit:
> >
> > let mut secinfo_buf: [u8; 64] = [0; 64]; // Initialize with zeros
> > secinfo_buf[0] = 1; // READ
> > secinfo_buf[1] = 2; // Regular
> >
> > I made a small bpftrace script, and here's what happens:
> >
> > $ cat sgx.bt
> > kretprobe:sgx_ioctl /retval != 0/
> > {
> >         printf("sgx_ioctl: %d\n", retval)
> > }
> >
> > kretprobe:sgx_perm_from_user_secinfo.constprop.0 /retval/
> > {
> >         printf("sgx_perm_from_user_secinfo.constprop.0 %d\n", retval)
> > }
> >
> > kretprobe:sgx_enclave_restrict_permissions /retval/
> > {
> >         printf("sgx_enclave_restrict_permissions: %d\n", retval)
> > }
> >
> > $ sudo bpftrace sgx.bt
> > [sudo] password for jarkko:
> > Attaching 3 probes...
> > sgx_perm_from_user_secinfo.constprop.0 -22
> > sgx_ioctl: -22
> >
> > Could be that I'm doing something wrong but instantly do not see
> > anything obvious...
>
> It was my bad, i.e.
>
> let mut secinfo_buf: [u8; 64] = [0; 64];
> secinfo_buf[0] = 1;
> secinfo_buf[1] = 0;
>  
> BR, Jarkko

According to SDM having page type as regular is fine for EMODPR,
i.e. that's why I did not care about having it in SECINFO.

Given that the opcode itself contains validation, I wonder
why this needs to be done:

if (secinfo.flags & ~SGX_SECINFO_PERMISSION_MASK)
return -EINVAL;

if (memchr_inv(secinfo.reserved, 0, sizeof(secinfo.reserved)))
return -EINVAL;

perm = secinfo.flags & SGX_SECINFO_PERMISSION_MASK;

I.e. why duplicate validation and why does it have different
invariant than the opcode?

While looking into this I also noticed:

static int sgx_validate_offset_length(struct sgx_encl *encl,
unsigned long offset,
unsigned long length)
{
if (!IS_ALIGNED(offset, PAGE_SIZE))
return -EINVAL;

if (!length || length & (PAGE_SIZE - 1))
return -EINVAL;

I guess also for length would be good idea to use IS_ALIGNED()
(this inconsistency inherits from the pre-existing code).

BR, Jarkko