Re: [PATCH v3 03/25] x86/sgx: Wipe out EREMOVE from sgx_free_epc_page()

From: Borislav Petkov
Date: Mon Mar 22 2021 - 18:38:35 EST


On Tue, Mar 23, 2021 at 11:06:43AM +1300, Kai Huang wrote:
> This path is called by host SGX driver only, so yes this leaking is done by
> host enclaves only.

Yes, so I was told.

> This patch is purpose is to break EREMOVE out of sgx_free_epc_page() so virtual
> EPC code can use sgx_free_epc_page(), and handle EREMOVE logic differently.
> This patch doesn't have intention to bring functional change. I changed the
> error msg because Dave said it worth to give a message saying EPC page is
> leaked, and I thought this minor change won't break anything.

I read that already - you don't have to repeat it.

> btw, currently virtual EPC code (patch 5) handles in similar way: There's one
> EREMOVE error is expected and virtual EPC code can handle, but for other
> errors, it means kernel bug, and virtual EPC code gives a WARN(), and that EPC
> page is leaked too:
>
> + WARN_ONCE(ret != SGX_CHILD_PRESENT,
> + "EREMOVE (EPC page 0x%lx): unexpected error: %d\n",
> + sgx_get_epc_phys_addr(epc_page), ret);
> + return ret;
>
> So to me they are just WARN() to catch kernel bug.

You don't care about users, do you? Because when that error happens,
they won't come crying to you to fix it.

Lemme save you some trouble: we don't take half-baked code into the
kernel until stuff has been discussed and analyzed properly. So instead
of trying to downplay this, try answering my questions.

Here's another one: when does EREMOVE fail?

/me goes and looks it up

"The instruction fails if the operand is not properly aligned or does
not refer to an EPC page or the page is in use by another thread, or
other threads are running in the enclave to which the page belongs. In
addition the instruction fails if the operand refers to an SECS with
associations."

And I guess those conditions will become more in the future.

Now, let's play. I'm the cloud admin and you're cloud OS customer
support. I say:

"I got this scary error message while running enclaves on my server

"EREMOVE returned ... . EPC page leaked. Reboot required to retrieve leaked pages."

but I cannot reboot that machine because there are guests running on it
and I'm getting paid for those guests and I might get sued if I do?"

Your turn, go wild.

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette