RE: [RFC PATCH v1 2/3] LSM/x86/sgx: Implement SGX specific hooks in SELinux
From: Xing, Cedric
Date: Thu Jun 13 2019 - 13:04:44 EST
> From: Christopherson, Sean J
> Sent: Wednesday, June 12, 2019 3:03 PM
>
> > I think this model works quite well in an SGX1 world. The main thing
> > that makes me uneasy about this model is that, in SGX2, it requires
> > that an SGX2-compatible enclave loader must pre-declare to the kernel
> > whether it intends for its dynamically allocated memory to be
> > ALLOW_EXEC. If ALLOW_EXEC is set but not actually needed, it will
> > still fail if DENY_X_IF_ALLOW_WRITE ends up being set. The other
> > version below does not have this limitation.
>
> I'm not convinced this will be a meaningful limitation in practice,
> though that's probably obvious from my RFCs :-). That being said, the
> UAPI quirk is essentially a dealbreaker for multiple people, so let's
> drop #1.
>
> I discussed the options with Cedric offline, and he is ok with option #2
> *if* the idea actually translates to acceptable code and doesn't present
> problems for userspace and/or future SGX features.
>
> So, I'll work on an RFC series to implement #2 as described below. If
> it works out, yay! If not, i.e. option #2 is fundamentally broken, I'll
> shift my focus to Cedric's code (option #3).
>
> > > 2. Pre-check LSM permissions and dynamically track mappings to
> > > enclave pages, e.g. add an SGX mprotect() hook to restrict W->X
> > > and WX based on the pre-checked permissions.
> > >
> > > Pros: Does not impact SGX UAPI, medium kernel complexity
> > > Cons: Auditing is complex/weird, requires taking enclave-specific
> > > lock during mprotect() to query/update tracking.
> >
> > Here's how this looks in my mind. It's quite similar, except that
> > ALLOW_READ, ALLOW_WRITE, and ALLOW_EXEC are replaced with a little
> > state machine.
> >
> > EADD does not take any special flags. It calls this LSM hook:
> >
> > int security_enclave_load(struct vm_area_struct *source);
> >
> > This hook can return -EPERM. Otherwise it returns 0 or
> > ALLOW_EXEC_IF_UNMODIFIED (i.e. 1). This hook enforces permissions
> > (a) and (b).
> >
> > The driver tracks a state for each page, and the possible states are:
> >
> > - CLEAN_MAYEXEC /* no W or X VMAs have existed, but X is okay */
> > - CLEAN_NOEXEC /* no W or X VMAs have existed, and X is not okay */
> > - CLEAN_EXEC /* no W VMA has existed, but an X VMA has existed */
> > - DIRTY /* a W VMA has existed */
> >
> > The initial state for a page is CLEAN_MAYEXEC if the hook said
> > ALLOW_EXEC_IF_UNMODIFIED and CLEAN_NOEXEC otherwise.
> >
> > The future EAUG does not call a hook at all and puts pages into the
> > state CLEAN_NOEXEC. If SGX3 or later ever adds EAUG-but-don't-clear,
> > it can call security_enclave_load() and add CLEAN_MAYEXEC pages if
> > appropriate.
> >
> > EINIT takes a sigstruct pointer. SGX calls a new hook:
> >
> > unsigned int security_enclave_init(struct sigstruct *sigstruct,
> > struct vm_area_struct *source, unsigned int flags);
> >
> > This hook can return -EPERM. Otherwise it returns 0 or a combination
> > of flags DENY_WX and DENY_X_DIRTY. The driver saves this value.
> > These represent permissions (c) and (d).
> >
> > If we want to have a permission for "execute code supplied from
> > outside the enclave that was not measured", we could have a flag like
> > HAS_UNMEASURED_CLEAN_EXEC_PAGE that the LSM could consider.
> >
> > mmap() and mprotect() enforce the following rules:
> >
> > - If VM_EXEC is requested and (either the page is DIRTY or VM_WRITE
> > is requested) and DENY_X_DIRTY, then deny.
> >
> > - If VM_WRITE and VM_EXEC are both requested and DENY_WX, then deny.
> >
> > - If VM_WRITE is requested, we need to update the state. If it was
> > CLEAN_EXEC, then we reject if DENY_X_DIRTY. Otherwise we change
> > the state to DIRTY.
> >
> > - If VM_EXEC is requested and the page is CLEAN_NOEXEC, then deny.
> >
> > mprotect() and mmap() do *not* call SGX-specific LSM hooks to ask for
> > permission, although they can optionally call an LSM hook if they hit
> > one of the -EPERM cases for auditing purposes.
> >
> > Before the SIGSTRUCT is provided to the driver, the driver acts as
> > though DENY_X_DIRTY and DENY_WX are both set.
I think we've been discussing two topics simultaneously: one is the state machine that accepts or rejects mmap()/mprotect() requests, and the other is where best to put it. I think we agree on the former, and IMO options #2 and #3 differ only in the latter.
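To make the agreed part concrete, here is a minimal userspace sketch in C of the state machine as described in the quoted text. It is an illustration under assumptions, not kernel code: REQ_WRITE and REQ_EXEC are stand-ins for VM_WRITE and VM_EXEC, the flag values are arbitrary, and encl_page_may_map() is a hypothetical name. The CLEAN_MAYEXEC -> CLEAN_EXEC transition on an executable mapping is inferred from the state definitions rather than stated as a rule.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* State names follow the quoted description; numeric values are arbitrary. */
enum page_state { CLEAN_MAYEXEC, CLEAN_NOEXEC, CLEAN_EXEC, DIRTY };

/* Stand-ins for VM_WRITE / VM_EXEC (assumed values, illustration only). */
#define REQ_WRITE 0x2u
#define REQ_EXEC  0x4u

/* Flags returned by the EINIT-time hook in the quoted proposal. */
#define DENY_WX      0x1u
#define DENY_X_DIRTY 0x2u

/*
 * Hypothetical helper: decide whether a mapping with protections 'prot'
 * is allowed for a page in '*state', and update the state on success.
 * Returns 0 to allow, -EPERM to deny. The state is untouched on denial.
 */
static int encl_page_may_map(enum page_state *state, unsigned int prot,
                             unsigned int deny)
{
    assert(state != NULL);

    int want_w = !!(prot & REQ_WRITE);
    int want_x = !!(prot & REQ_EXEC);

    /* X on a page that is, or is about to become, dirty. */
    if (want_x && (*state == DIRTY || want_w) && (deny & DENY_X_DIRTY))
        return -EPERM;
    /* Simultaneous W and X. */
    if (want_w && want_x && (deny & DENY_WX))
        return -EPERM;
    /* W on a page that has already been mapped executable. */
    if (want_w && *state == CLEAN_EXEC && (deny & DENY_X_DIRTY))
        return -EPERM;
    /* X on a page that was never eligible for execution. */
    if (want_x && *state == CLEAN_NOEXEC)
        return -EPERM;

    /* All checks passed: record what kind of mapping has now existed. */
    if (want_w)
        *state = DIRTY;
    else if (want_x && *state == CLEAN_MAYEXEC)
        *state = CLEAN_EXEC;    /* inferred transition, see lead-in */
    return 0;
}
```

Before the SIGSTRUCT is provided, a caller would pass DENY_WX | DENY_X_DIRTY as 'deny', matching the last paragraph of the quoted description.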
Option #2 keeps the state machine inside the SGX subsystem, so it could reuse the existing data structures for page tracking/locking to some extent. Sean may have smarter ideas, but it looks to me like the existing 'struct sgx_encl_page' tracks individual enclave pages, while the FSM states apply to ranges. So in order *not* to test page by page in mmap()/mprotect(), I guess some new range-oriented structures are still necessary. But I don't think that is very important anyway.
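To illustrate what "range-oriented" could mean here, the following is a small userspace sketch in C. Everything in it is hypothetical (struct encl_range, encl_find_range(), page-number units), and the linear scan is only for brevity; a real implementation would presumably use whatever tree or list the driver already has.

```c
#include <assert.h>
#include <stddef.h>

enum range_state { CLEAN_MAYEXEC, CLEAN_NOEXEC, CLEAN_EXEC, DIRTY };

/* Hypothetical record: all pages in [start, end) share one FSM state. */
struct encl_range {
    unsigned long start;    /* first page number in the range */
    unsigned long end;      /* one past the last page number */
    enum range_state state;
};

/*
 * Return the range that fully contains [start, end), or NULL if no single
 * range does. With a hit, mmap()/mprotect() can be checked against one
 * state instead of testing page by page.
 */
static const struct encl_range *
encl_find_range(const struct encl_range *ranges, size_t n,
                unsigned long start, unsigned long end)
{
    assert(ranges != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        if (ranges[i].start <= start && end <= ranges[i].end)
            return &ranges[i];
    }
    return NULL;
}
```

A request that straddles two ranges gets NULL in this sketch; the caller would then have to split the request or fall back to checking each range it touches.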
My major concern is more from the architecture/modularity perspective. Specifically, the state machine is defined by the LSM, but SGX performs the state transitions. That's a brittle relationship that would break easily if the state machine changes in the future, or if different LSM modules want to define different FSMs (composed of different sets of states and/or triggers). After all, what the SGX subsystem needs is just the decision, not the FSM definition. I think we should take a closer look at this area once Sean's patch comes out.
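As a rough sketch of the "decision only" split, assuming hypothetical names throughout (struct encl_range_sec, encl_decide_fn, toy_policy_decide(), sgx_side_mprotect()), the SGX side could hold an opaque per-range blob and simply forward each request to whatever policy is plugged in. The toy policy below is just one possible FSM; none of this is an existing kernel interface.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Stand-ins for VM_WRITE / VM_EXEC (assumed values, illustration only). */
#define REQ_WRITE 0x2u
#define REQ_EXEC  0x4u

/*
 * Hypothetical per-range security blob. Its layout belongs to the policy
 * module; the SGX side only stores and passes it along.
 */
struct encl_range_sec {
    int dirty;  /* this toy policy only remembers "was ever writable" */
};

/* Hypothetical decision callback: 0 to allow, -EPERM to deny. */
typedef int (*encl_decide_fn)(struct encl_range_sec *sec, unsigned int prot);

/* One possible policy: never execute memory that was ever writable. */
static int toy_policy_decide(struct encl_range_sec *sec, unsigned int prot)
{
    if ((prot & REQ_EXEC) && (sec->dirty || (prot & REQ_WRITE)))
        return -EPERM;
    if (prot & REQ_WRITE)
        sec->dirty = 1;     /* the state transition happens in the policy */
    return 0;
}

/* "SGX side": knows nothing about states, acts on the decision only. */
static int sgx_side_mprotect(encl_decide_fn decide,
                             struct encl_range_sec *sec, unsigned int prot)
{
    assert(decide != NULL && sec != NULL);
    return decide(sec, prot);
}
```

With this shape, a different module could supply a different callback and blob layout without any change to sgx_side_mprotect(), which is the modularity property argued for above.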