Re: [PATCH v3 03/25] x86/sgx: Wipe out EREMOVE from sgx_free_epc_page()

From: Borislav Petkov
Date: Mon Mar 22 2021 - 14:17:34 EST


On Fri, Mar 19, 2021 at 08:22:19PM +1300, Kai Huang wrote:
> +/**
> + * sgx_encl_free_epc_page - free EPC page assigned to an enclave
> + * @page: EPC page to be freed
> + *
> + * Free EPC page assigned to an enclave. It does EREMOVE for the page, and
> + * only upon success, it puts the page back to free page list. Otherwise, it
> + * gives a WARNING to indicate page is leaked, and require reboot to retrieve
> + * leaked pages.
> + */
> +void sgx_encl_free_epc_page(struct sgx_epc_page *page)
> +{
> + int ret;
> +
> + WARN_ON_ONCE(page->flags & SGX_EPC_PAGE_RECLAIMER_TRACKED);
> +
> + /*
> + * Give a message to remind EPC page is leaked when EREMOVE fails,
> + * and requires machine reboot to get leaked pages back. This can
> + * be improved in future by adding stats of leaked pages, etc.
> + */
> +#define EREMOVE_ERROR_MESSAGE \
> + "EREMOVE returned %d (0x%x). EPC page leaked. Reboot required to retrieve leaked pages."

A reboot? Seriously? Why?

How are you going to explain to cloud people that they need to reboot
their fat server? The same cloud people who want to make sure Intel
supports late microcode loading, no matter the effort, just to avoid
rebooting the machine.

But now all of a sudden, if they wanna have SGX enclaves in guests, they
need to be prepared for a potential reboot.

I sure hope I'm missing something...
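
For reference, this is how I read the flow described in the quoted
kernel-doc: do EREMOVE, put the page back on the free list only on
success, otherwise warn and leak it. The sketch below is a standalone
userspace model of that reading, not the patch's code. Every name in it
(model_epc_page, model_eremove(), model_encl_free_epc_page() and the two
counters) is made up for illustration, and the EREMOVE failure is just a
flag on the page.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical userspace model; none of these names are the kernel's. */
struct model_epc_page {
	bool eremove_fails;	/* pretend EREMOVE would fail for this page */
	bool on_free_list;	/* set once the page is back on the free list */
};

static unsigned int model_free_pages;
static unsigned int model_leaked_pages;

/* Stand-in for an EREMOVE wrapper: 0 on success, non-zero on error. */
static int model_eremove(const struct model_epc_page *page)
{
	return page->eremove_fails ? 1 : 0;
}

/* Free the page only if EREMOVE succeeded; otherwise report the leak. */
static void model_encl_free_epc_page(struct model_epc_page *page)
{
	int ret;

	assert(page != NULL);

	ret = model_eremove(page);
	if (ret) {
		/* The page is deliberately not returned to the free list. */
		fprintf(stderr, "EREMOVE returned %d (0x%x). EPC page leaked.\n",
			ret, (unsigned int)ret);
		model_leaked_pages++;
		return;
	}

	page->on_free_list = true;
	model_free_pages++;
}
```

In this model a page whose EREMOVE fails is only counted as leaked and
never becomes usable again, which is exactly the part I'm objecting to.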

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette