Re: [PATCH v17 18/23] platform/x86: Intel SGX driver
From: Sean Christopherson
Date: Mon Dec 17 2018 - 13:10:00 EST
On Mon, Dec 17, 2018 at 07:49:35PM +0200, Jarkko Sakkinen wrote:
> On Mon, Dec 17, 2018 at 09:31:06AM -0800, Sean Christopherson wrote:
> > This doesn't work as-is. sgx_encl_release() needs to use sgx_free_page()
> > and not __sgx_free_page() so that we get a WARN() if the page can't be
> > freed. sgx_invalidate() needs to use __sgx_free_page() as freeing a page
> > can fail due to running concurrently with reclaim. I'll play around with
> > the code a bit, there's probably a fairly clean way to share code between
> > the two flows.
> Hmm... but why issue a warning in that case? It should be legit
No, EREMOVE should never fail if the enclave is being released, i.e. all
references to the enclave are gone. And failure during sgx_encl_release()
means we leaked an EPC page, which warrants a WARN.
The only legitimate reason __sgx_free_page() can fail in sgx_invalidate()
is because a page might be in the process of being reclaimed. We could
theoretically WARN on EREMOVE failure in that case, but it'd make the code
a little fragile and it's not "fatal" in the sense that we get a second
chance to free the page during sgx_encl_release().
And unless I missed something, using sgx_invalidate() means were' leaking
all sgx_encl_page structs as well as the radix tree entries.
> > sgx_encl_release_worker() calls do_unmap() without checking the validity
> > of the page tables. As is, the code doesn't even guarantee mm_struct
> > itself is valid.
> > The easiest fix I can think of is to add a SGX_ENCL_MM_RELEASED flag
> > that is set along with SGX_ENCL_DEAD in sgx_mmu_notifier_release(), and
> > only call do_unmap() if SGX_ENCL_MM_RELEASED is false. Note that this
> > means we cant unregister the mmu_notifier until after do_unmap(), but
> > that's true no matter what since we're relying on the mmu_notifier to
> > hold a reference to mm_struct. Patch attached.
> OK, the fix change makes sense but I'm thinking that would it be a
> better idea just to set mm NULL and check that instead?
That makes sense.