Re: [PATCH 2/2] x86/sgx: Resolve EREMOVE page vs EAUG page data race

From: Jarkko Sakkinen
Date: Mon Apr 29 2024 - 09:12:44 EST


On Mon Apr 29, 2024 at 1:43 PM EEST, Dmitrii Kuvaiskii wrote:
> Two enclave threads may try to add and remove the same enclave page
> simultaneously (e.g., if the SGX runtime supports both lazy allocation
> and `MADV_DONTNEED` semantics). Consider this race:
>
> 1. T1 performs page removal in sgx_encl_remove_pages() and stops right
> after removing the page table entry and right before re-acquiring the
> enclave lock to EREMOVE and xa_erase(&encl->page_array) the page.
> 2. T2 tries to access the page, and #PF[not_present] is raised. The
> condition to EAUG in sgx_vma_fault() is not satisfied because the
> page is still present in encl->page_array, thus the SGX driver
> assumes that the fault happened because the page was swapped out. The
> driver continues on a code path that installs a page table entry
> *without* performing EAUG.
> 3. The enclave page metadata is in inconsistent state: the PTE is
> installed but there was no EAUG. Thus, T2 in userspace infinitely
> receives SIGSEGV on this page (and EACCEPT always fails).
>
> Fix this by making sure that T1 (the page-removing thread) always wins
> this data race. In particular, the page-being-removed is marked as such,
> and T2 retries until the page is fully removed.
>
> Fixes: 9849bb27152c ("x86/sgx: Support complete page removal")
> Cc: stable@xxxxxxxxxxxxxxx
> Signed-off-by: Dmitrii Kuvaiskii <dmitrii.kuvaiskii@xxxxxxxxx>
> ---
> arch/x86/kernel/cpu/sgx/encl.c | 3 ++-
> arch/x86/kernel/cpu/sgx/encl.h | 3 +++
> arch/x86/kernel/cpu/sgx/ioctl.c | 1 +
> 3 files changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c
> index 41f14b1a3025..7ccd8b2fce5f 100644
> --- a/arch/x86/kernel/cpu/sgx/encl.c
> +++ b/arch/x86/kernel/cpu/sgx/encl.c
> @@ -257,7 +257,8 @@ static struct sgx_encl_page *__sgx_encl_load_page(struct sgx_encl *encl,
>
> /* Entry successfully located. */
> if (entry->epc_page) {
> - if (entry->desc & SGX_ENCL_PAGE_BEING_RECLAIMED)
> + if (entry->desc & (SGX_ENCL_PAGE_BEING_RECLAIMED |
> + SGX_ENCL_PAGE_BEING_REMOVED))
> return ERR_PTR(-EBUSY);
>
> return entry;
> diff --git a/arch/x86/kernel/cpu/sgx/encl.h b/arch/x86/kernel/cpu/sgx/encl.h
> index f94ff14c9486..fff5f2293ae7 100644
> --- a/arch/x86/kernel/cpu/sgx/encl.h
> +++ b/arch/x86/kernel/cpu/sgx/encl.h
> @@ -25,6 +25,9 @@
> /* 'desc' bit marking that the page is being reclaimed. */
> #define SGX_ENCL_PAGE_BEING_RECLAIMED BIT(3)
>
> +/* 'desc' bit marking that the page is being removed. */
> +#define SGX_ENCL_PAGE_BEING_REMOVED BIT(2)
> +
> struct sgx_encl_page {
> unsigned long desc;
> unsigned long vm_max_prot_bits:8;
> diff --git a/arch/x86/kernel/cpu/sgx/ioctl.c b/arch/x86/kernel/cpu/sgx/ioctl.c
> index b65ab214bdf5..c542d4dd3e64 100644
> --- a/arch/x86/kernel/cpu/sgx/ioctl.c
> +++ b/arch/x86/kernel/cpu/sgx/ioctl.c
> @@ -1142,6 +1142,7 @@ static long sgx_encl_remove_pages(struct sgx_encl *encl,
> * Do not keep encl->lock because of dependency on
> * mmap_lock acquired in sgx_zap_enclave_ptes().
> */
> + entry->desc |= SGX_ENCL_PAGE_BEING_REMOVED;
> mutex_unlock(&encl->lock);
>
> sgx_zap_enclave_ptes(encl, addr);

It is somewhat trivial to NAK this as the commit message does
not do any effort describing the new flag. By default at least
I have strong opposition against any new flags related to
reclaiming even if it needs a bit of extra synchronization
work in the user space.

One way to describe concurrency scenarios would be to take
example from https://www.kernel.org/doc/Documentation/memory-barriers.txt

I.e. see the examples with CPU 1 and CPU 2.

BR, Jarkko