Re: [PATCH v3 2/2] x86/sgx: Resolve EREMOVE page vs EAUG page data race

From: Dave Hansen
Date: Tue May 28 2024 - 12:23:24 EST


On 5/17/24 04:06, Dmitrii Kuvaiskii wrote:
..

First, why is SGX so special here? How is the SGX problem different
than what the core mm code does?

> --- a/arch/x86/kernel/cpu/sgx/encl.h
> +++ b/arch/x86/kernel/cpu/sgx/encl.h
> @@ -25,6 +25,9 @@
> /* 'desc' bit marking that the page is being reclaimed. */
> #define SGX_ENCL_PAGE_BEING_RECLAIMED BIT(3)
>
> +/* 'desc' bit marking that the page is being removed. */
> +#define SGX_ENCL_PAGE_BEING_REMOVED BIT(2)

Second, convince me that this _needs_ a new bit. Why can't we just have
a bit that effectively means "return EBUSY if you see this bit when
handling a fault".

> struct sgx_encl_page {
> unsigned long desc;
> unsigned long vm_max_prot_bits:8;
> diff --git a/arch/x86/kernel/cpu/sgx/ioctl.c b/arch/x86/kernel/cpu/sgx/ioctl.c
> index 5d390df21440..de59219ae794 100644
> --- a/arch/x86/kernel/cpu/sgx/ioctl.c
> +++ b/arch/x86/kernel/cpu/sgx/ioctl.c
> @@ -1142,6 +1142,7 @@ static long sgx_encl_remove_pages(struct sgx_encl *encl,
> * Do not keep encl->lock because of dependency on
> * mmap_lock acquired in sgx_zap_enclave_ptes().
> */
> + entry->desc |= SGX_ENCL_PAGE_BEING_REMOVED;

This also needs a comment, no matter what.