Re: [PATCH v30 12/20] x86/sgx: Add a page reclaimer

From: Sean Christopherson
Date: Fri May 22 2020 - 02:58:04 EST


On Fri, May 15, 2020 at 03:44:02AM +0300, Jarkko Sakkinen wrote:
> +/**
> + * sgx_reclaim_pages() - Reclaim EPC pages from the consumers
> + *
> + * Take a fixed number of pages from the head of the active page pool and
> + * reclaim them to the enclave's private shmem files. Skip the pages, which
> + * have been accessed since the last scan. Move those pages to the tail of
> + * active page pool so that the pages get scanned in LRU like fashion.
> + */
> +void sgx_reclaim_pages(void)
> +{
> +	struct sgx_epc_page *chunk[SGX_NR_TO_SCAN];
> +	struct sgx_backing backing[SGX_NR_TO_SCAN];
> +	struct sgx_epc_section *section;
> +	struct sgx_encl_page *encl_page;
> +	struct sgx_epc_page *epc_page;
> +	int cnt = 0;
> +	int ret;
> +	int i;
> +
> +	spin_lock(&sgx_active_page_list_lock);
> +	for (i = 0; i < SGX_NR_TO_SCAN; i++) {
> +		if (list_empty(&sgx_active_page_list))
> +			break;
> +
> +		epc_page = list_first_entry(&sgx_active_page_list,
> +					    struct sgx_epc_page, list);
> +		list_del_init(&epc_page->list);
> +		encl_page = epc_page->owner;
> +
> +		if (kref_get_unless_zero(&encl_page->encl->refcount) != 0)
> +			chunk[cnt++] = epc_page;
> +		else
> +			/* The owner is freeing the page. No need to add the
> +			 * page back to the list of reclaimable pages.
> +			 */
> +			epc_page->desc &= ~SGX_EPC_PAGE_RECLAIMABLE;
> +	}
> +	spin_unlock(&sgx_active_page_list_lock);
> +
> +	for (i = 0; i < cnt; i++) {
> +		epc_page = chunk[i];
> +		encl_page = epc_page->owner;
> +
> +		if (!sgx_reclaimer_age(epc_page))
> +			goto skip;
> +
> +		ret = sgx_encl_get_backing(encl_page->encl,
> +					   SGX_ENCL_PAGE_INDEX(encl_page),
> +					   &backing[i]);
> +		if (ret)
> +			goto skip;
> +
> +		mutex_lock(&encl_page->encl->lock);
> +		encl_page->desc |= SGX_ENCL_PAGE_RECLAIMED;
> +		mutex_unlock(&encl_page->encl->lock);
> +		continue;
> +
> +skip:
> +		kref_put(&encl_page->encl->refcount, sgx_encl_release);
> +
> +		spin_lock(&sgx_active_page_list_lock);
> +		list_add_tail(&epc_page->list, &sgx_active_page_list);
> +		spin_unlock(&sgx_active_page_list_lock);

Ugh, this is wrong. If the above kref_put() drops the last reference and
releases the enclave, adding the page to the active page list will result
in a use-after-free as the enclave will have been freed. It also leaks the
EPC page because sgx_encl_destroy() skips pages that are in the process of
being reclaimed (as detected by list_empty()).

The "original" code did the put() after list_add_tail(), but the put() was
moved in v15 to fix a bug where it could drop a reference to the wrong
enclave if the page was freed and reallocated by a different CPU between
list_add_tail() and put(). But, that particular bug only occurred because
the code at the time was:

	sgx_encl_page_put(epc_page);

I.e. the backpointer in epc_page was consumed after dropping the spin lock.
So long as epc_page->owner (well, epc_page in general) isn't dereferenced,
I'm 99% certain this can be fixed simply by doing kref_put() after moving
the page back to the active page list.
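
A minimal userspace sketch of that ordering, not the actual patch. All
model_* names are hypothetical stand-ins: model_encl_put() plays the role of
kref_put() with sgx_encl_release(), and the on_active_list flag stands in for
the list_empty() check that sgx_encl_destroy() uses. The enclave pointer is
copied to a local while the reference still pins it, so nothing reachable
from the page is dereferenced after the page goes back on the list.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace stand-ins; these are not the real SGX types. */
struct model_epc_page;

struct model_encl {
	int refcount;                 /* plays the role of the kref */
	struct model_epc_page *page;  /* the single EPC page this enclave owns */
};

struct model_epc_page {
	struct model_encl *encl;      /* simplified back-pointer to the owner */
	bool on_active_list;          /* stands in for !list_empty(&page->list) */
	bool freed;                   /* set when the release path frees the page */
};

/* Drop one reference; on the last put, tear the enclave down. */
static void model_encl_put(struct model_encl *encl)
{
	assert(encl->refcount > 0);
	if (--encl->refcount > 0)
		return;

	/*
	 * Assumed to mirror the behaviour described above: the release path
	 * only frees pages it finds on the active list and skips the rest.
	 */
	if (encl->page->on_active_list) {
		encl->page->on_active_list = false;
		encl->page->freed = true;
	}
	free(encl);
}

/*
 * Skip path with the proposed ordering: put the page back on the active
 * list first, then drop the enclave reference using only the local copy.
 */
static void model_skip_page(struct model_epc_page *page)
{
	struct model_encl *encl = page->encl;  /* snapshot while still pinned */

	page->on_active_list = true;  /* list_add_tail() under the list lock */
	model_encl_put(encl);         /* the put() comes last */
}
```

With this ordering, a final put() finds the page on the list and frees it,
so the model neither touches a freed enclave nor leaks the page.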

> +
> +		chunk[i] = NULL;
> +	}
> +
> +	for (i = 0; i < cnt; i++) {
> +		epc_page = chunk[i];
> +		if (epc_page)
> +			sgx_reclaimer_block(epc_page);
> +	}
> +
> +	for (i = 0; i < cnt; i++) {
> +		epc_page = chunk[i];
> +		if (!epc_page)
> +			continue;
> +
> +		encl_page = epc_page->owner;
> +		sgx_reclaimer_write(epc_page, &backing[i]);
> +		sgx_encl_put_backing(&backing[i], true);
> +
> +		kref_put(&encl_page->encl->refcount, sgx_encl_release);
> +		epc_page->desc &= ~SGX_EPC_PAGE_RECLAIMABLE;
> +
> +		section = sgx_epc_section(epc_page);
> +		spin_lock(&section->lock);
> +		list_add_tail(&epc_page->list, &section->page_list);
> +		section->free_cnt++;
> +		spin_unlock(&section->lock);
> +	}
> +}