Re: RFC: userspace exception fixups

From: Rich Felker
Date: Fri Nov 02 2018 - 13:34:36 EST


On Fri, Nov 02, 2018 at 10:16:02AM -0700, Andy Lutomirski wrote:
> On Fri, Nov 2, 2018 at 10:05 AM Jethro Beekman <jethro@xxxxxxxxxxxx> wrote:
> >
> > On 2018-11-02 10:01, Andy Lutomirski wrote:
> > > On Fri, Nov 2, 2018 at 9:56 AM Jethro Beekman <jethro@xxxxxxxxxxxx> wrote:
> > >>
> > >> On 2018-11-02 09:52, Sean Christopherson wrote:
> > >>> On Fri, Nov 02, 2018 at 04:37:10PM +0000, Jethro Beekman wrote:
> > >>>> On 2018-11-02 09:30, Sean Christopherson wrote:
> > >>>>> ... The intended convention for EENTER is to have an ENCLU at the AEX target ...
> > >>>>>
> > >>>>> ... to further enforce that the AEX target needs to be ENCLU.
> > >>>>
> > >>>> Some SGX runtimes may want to use a different AEX target.
> > >>>
> > >>> To what end? Userspace gets no indication as to why the AEX occurred.
> > >>> And if exceptions are getting transferred to userspace the trampoline
> > >>> would effectively be handling only INTR, NMI, #MC and EPC #PF.
> > >>>
> > >>
> > >> Various reasons...
> > >>
> > >> Userspace may have established an exception handling convention with the
> > >> enclave (by setting TCS.NSSA > 1) and may want to call EENTER instead of
> > >> ERESUME.
> > >>
> > >
> > > Ugh,
> > >
> > > I sincerely hope that a future ISA extension lets the kernel return
> > > directly back to enclave mode so that AEX events become entirely
> > > invisible to user code.
> >
> > Can you explain how this would work for things like #BR/#DE/#UD that
> > need to be fixed up by code running in the enclave before it can be resumed?
> >
>
> Sure. A better enclave entry function would complete in one of two ways:
>
> 1. The enclave exited normally. Some register output would indicate this.
>
> 2. The enclave exited due to an exception or interrupt. The kernel
> would be entered directly and notified of what happened. The kernel
> would fix it up if needed (#PF), handle an interrupt (for an enclave
> exit due to an interrupt) and reenter the enclave. If the error
> is not kernel-fixable-up, it would return back to userspace with some
> explanation of what happened. Kind of like normal user code.
>
> Alternatively, the CPU could directly distinguish between exceptions
> that need the enclave's attention (#BR) and those that don't.
>
> The fact that user code is involved in resuming an enclave when a
> hardware interrupt occurs is silly IMO.

Agreed absolutely. If user code involvement in the resume is necessary,
it seems like there should be an agreed-upon protocol such that the
kernel can make it happen by returning to code in the vdso that performs
the actual resume, so that the application never sees it.
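
To make the two completion paths quoted above concrete, here is a rough
C model of the dispatch. It is only a simulation under my own
assumptions: no SGX instructions are executed, and every name in it
(exit_cause, classify_exit, run_enclave, run_result) is hypothetical,
not an existing or proposed kernel/vdso interface. The split follows
the thread: interrupts and EPC #PF are handled and the enclave is
resumed without the caller noticing, while #BR/#DE/#UD are reported
back to the caller.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical exit causes, named for illustration only. */
enum exit_cause {
    EXIT_NORMAL,     /* enclave left voluntarily */
    EXIT_INTERRUPT,  /* asynchronous event such as INTR, NMI, #MC */
    EXIT_EPC_PF,     /* EPC page fault the kernel can fix up */
    EXIT_BR,         /* #BR: needs the enclave's own handler */
    EXIT_DE,         /* #DE: needs the enclave's own handler */
    EXIT_UD          /* #UD: needs the enclave's own handler */
};

enum action {
    ACT_RETURN_NORMAL,   /* case 1: normal exit, reported to the caller */
    ACT_RESUME,          /* case 2: fixed up, enclave resumed silently */
    ACT_REPORT_TO_USER   /* case 2: not fixable, caller gets the cause */
};

/* Decide what the (hypothetical) kernel/vdso path does for one exit. */
static enum action classify_exit(enum exit_cause cause)
{
    switch (cause) {
    case EXIT_NORMAL:
        return ACT_RETURN_NORMAL;
    case EXIT_INTERRUPT:
    case EXIT_EPC_PF:
        return ACT_RESUME;
    case EXIT_BR:
    case EXIT_DE:
    case EXIT_UD:
        return ACT_REPORT_TO_USER;
    }
    return ACT_REPORT_TO_USER; /* unknown cause: let the caller decide */
}

struct run_result {
    enum action outcome;    /* ACT_RESUME means the script ran out */
    enum exit_cause cause;  /* last exit cause that was examined */
    int resumes;            /* resumes the caller never saw */
};

/* Replay a scripted list of exits, standing in for a running enclave. */
static struct run_result run_enclave(const enum exit_cause *events, size_t n)
{
    struct run_result r = { ACT_RESUME, EXIT_NORMAL, 0 };

    assert(events != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        r.cause = events[i];
        r.outcome = classify_exit(events[i]);
        if (r.outcome != ACT_RESUME)
            return r;       /* only now does the caller regain control */
        r.resumes++;        /* handled and resumed behind the caller's back */
    }
    return r;
}
```

The point of the model is that the caller of run_enclave() sees exactly
one return per entry, regardless of how many interrupts or EPC faults
happened in between.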

Rich