Re: [PATCH v2 1/2] x86/sgx: Resolve EAUG race where losing thread returns SIGBUS
From: Jarkko Sakkinen
Date: Wed May 15 2024 - 10:28:57 EST
On Wed May 15, 2024 at 5:15 PM EEST, Dave Hansen wrote:
> On 5/15/24 06:54, Jarkko Sakkinen wrote:
> > I'd cut 90% of the description out and just make the argument of
> > the wrong error code, and done. The sequence is great for showing
> > how this could happen. The prose makes my head hurt tbh.
>
> The changelog is too long, but not fatally so. I'd much rather have a
> super verbose description than something super sparse.
>
> Would something like this make more sense to folks?
>
> Imagine an mmap()'d file. Two threads touch the same address at
> the same time and fault. Both allocate a physical page and race
> to install a PTE for that page. Only one will win the race. The
> loser frees its page, but still continues handling the fault as
> a success and returns VM_FAULT_NOPAGE from the fault handler.
>
> The same race can happen with SGX. But there's a bug: the loser
> in the SGX case steers into a failure path. The loser EREMOVE's
> the winner's EPC page, then returns SIGBUS, likely killing the app.
>
> Fix the SGX loser's behavior. Change the return code to
> VM_FAULT_NOPAGE to avoid SIGBUS and call sgx_free_epc_page()
> which avoids EREMOVE'ing the winner's page and only frees the
> page that the loser allocated.
Yes!

I did read the whole thing. My comment was only related to the
chain of maintainers who also have to deal with this patch
eventually.

BR, Jarkko
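
For anyone who wants to see the intended loser behavior from the quoted
changelog in isolation, here is a rough userspace model. It is not the
kernel code and not the patch. Every name in it (fault_result,
handle_fault, pte_slot, run_race and so on) is made up for illustration,
a single atomic pointer stands in for the PTE, and malloc()/free() stand
in for EPC page allocation and sgx_free_epc_page(). It only shows the
rule "the loser frees its own page and still reports success".

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins, not the kernel's VM_FAULT_* definitions. */
enum fault_result { FAULT_NOPAGE, FAULT_SIGBUS };

struct page { int owner; };

/* Models the one PTE that both threads race to install. */
static _Atomic(struct page *) pte_slot;
/* Counts pages released by losing threads. */
static atomic_int pages_freed;

/* Model of the fixed fault path described in the changelog. */
static enum fault_result handle_fault(int tid)
{
	struct page *expected = NULL;
	struct page *mine = malloc(sizeof(*mine));

	if (!mine)
		return FAULT_SIGBUS;	/* genuine allocation failure */
	mine->owner = tid;

	/* Only one thread can swap NULL -> its own page. */
	if (atomic_compare_exchange_strong(&pte_slot, &expected, mine))
		return FAULT_NOPAGE;	/* winner: page is installed */

	/*
	 * Loser: another page is already mapped. Free only the page
	 * this thread allocated, leave the winner's page alone, and
	 * report success so the access is simply retried.
	 */
	free(mine);
	atomic_fetch_add(&pages_freed, 1);
	return FAULT_NOPAGE;
}

struct racer { int tid; enum fault_result result; };

static void *racer_main(void *arg)
{
	struct racer *r = arg;

	r->result = handle_fault(r->tid);
	return NULL;
}

/* Runs two racing faults; returns how many reported FAULT_NOPAGE. */
static int run_race(void)
{
	struct racer r[2] = { { 0, FAULT_SIGBUS }, { 1, FAULT_SIGBUS } };
	pthread_t t[2];
	int i;

	atomic_store(&pte_slot, (struct page *)NULL);
	atomic_store(&pages_freed, 0);

	for (i = 0; i < 2; i++) {
		int err = pthread_create(&t[i], NULL, racer_main, &r[i]);

		assert(err == 0);
	}
	for (i = 0; i < 2; i++)
		pthread_join(t[i], NULL);

	return (r[0].result == FAULT_NOPAGE) + (r[1].result == FAULT_NOPAGE);
}
```

Calling run_race() from a main() (build with -pthread) should report
success for both threads, leave exactly one page installed in pte_slot,
and count exactly one page freed by the loser. In this model the second
thread always loses even if the first has already finished, so the
outcome does not depend on scheduling.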