Re: [PATCH v3 1/5] x86/sgx: Fix a resource leak in sgx_init()

From: Sean Christopherson
Date: Wed Mar 10 2021 - 10:50:25 EST


On Wed, Mar 10, 2021, Jarkko Sakkinen wrote:
> On Wed, Mar 03, 2021 at 08:56:52AM -0800, Dave Hansen wrote:
> > On 3/3/21 7:03 AM, Jarkko Sakkinen wrote:
> > > If sgx_page_cache_init() fails in the middle, a trivial return
> > > statement causes unused memory and virtual address space reserved for
> > > the EPC section, not freed. Fix this by using the same rollback, as
> > > when sgx_page_reclaimer_init() fails.
> > ...
> > > @@ -708,8 +708,10 @@ static int __init sgx_init(void)
> > >  	if (!cpu_feature_enabled(X86_FEATURE_SGX))
> > >  		return -ENODEV;
> > >
> > > -	if (!sgx_page_cache_init())
> > > -		return -ENOMEM;
> > > +	if (!sgx_page_cache_init()) {
> > > +		ret = -ENOMEM;
> > > +		goto err_page_cache;
> > > +	}
> >
> >
> > Currently, the only way sgx_page_cache_init() can fail is in the case
> > that there are no sections:
> >
> > 	if (!sgx_nr_epc_sections) {
> > 		pr_err("There are zero EPC sections.\n");
> > 		return false;
> > 	}
> >
> > That only happened if all sgx_setup_epc_section() calls failed.
> > sgx_setup_epc_section() never both allocates memory with vmalloc for
> > section->pages *and* fails. If sgx_setup_epc_section() has a successful
> > memremap() but a failed vmalloc(), it cleans up with memunmap().
> >
> > In other words, I see how this _looks_ like a memory leak from
> > sgx_init(), but I don't see an actual leak in practice.
> >
> > Am I missing something?
>
> In sgx_setup_epc_section():
>
>
> 	section->pages = vmalloc(nr_pages * sizeof(struct sgx_epc_page));
> 	if (!section->pages) {
> 		memunmap(section->virt_addr);
> 		return false;
> 	}
>
> I.e. this rollback does not happen without this fix applied:
>
> 	for (i = 0; i < sgx_nr_epc_sections; i++) {
> 		vfree(sgx_epc_sections[i].pages);
> 		memunmap(sgx_epc_sections[i].virt_addr);
> 	}

Dave is pointing out that sgx_page_cache_init() fails if and only if _all_
sections fail sgx_setup_epc_section(), and if all sections fail then
sgx_nr_epc_sections is '0' and the rollback loop above is a nop.

That behavior is by design, as we didn't want to kill SGX if a single section
failed to initialize for whatever reason.
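
To make the argument concrete, here is a rough userspace model of the logic,
not the kernel code. Every name in it (setup_section(), page_cache_init(),
rollback(), nr_sections, live_allocs) is a hypothetical stand-in, and the
loop over sections is an assumption about the general shape of
sgx_page_cache_init() rather than a copy of it. It only tracks a counter of
outstanding allocations instead of calling memremap()/vmalloc().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins; none of these names exist in the kernel. */
static int nr_sections;   /* models sgx_nr_epc_sections */
static int live_allocs;   /* outstanding mappings plus page arrays */

/* Models sgx_setup_epc_section(): a failed call leaves nothing allocated. */
static bool setup_section(bool map_ok, bool alloc_ok)
{
	if (!map_ok)
		return false;
	live_allocs++;			/* memremap() succeeded */

	if (!alloc_ok) {
		live_allocs--;		/* memunmap() on vmalloc() failure */
		return false;
	}
	live_allocs++;			/* vmalloc() succeeded */
	return true;
}

/* Models sgx_page_cache_init(): only successful sections are counted. */
static bool page_cache_init(const bool *map_ok, const bool *alloc_ok, int n)
{
	nr_sections = 0;
	for (int i = 0; i < n; i++) {
		if (setup_section(map_ok[i], alloc_ok[i]))
			nr_sections++;
	}
	return nr_sections != 0;
}

/* Models the rollback loop; returns how many iterations actually ran. */
static int rollback(void)
{
	int iterations = 0;

	/* Each counted section holds exactly one mapping and one array. */
	assert(live_allocs == 2 * nr_sections);
	for (int i = 0; i < nr_sections; i++) {
		live_allocs -= 2;	/* vfree() plus memunmap() */
		iterations++;
	}
	nr_sections = 0;
	return iterations;
}
```

In this model, page_cache_init() returns false only when nr_sections is 0,
and in that case live_allocs is also 0, so rollback() runs zero iterations
and there is nothing left to free.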