Re: [PATCH v33 03/21] x86/mm: x86/sgx: Signal SIGSEGV with PF_SGX

From: Borislav Petkov
Date: Thu Jun 25 2020 - 04:59:42 EST


On Thu, Jun 18, 2020 at 01:08:25AM +0300, Jarkko Sakkinen wrote:
> From: Sean Christopherson <sean.j.christopherson@xxxxxxxxx>
>
> Include SGX bit to the PF error codes and throw SIGSEGV with PF_SGX when
> a #PF with SGX set happens.
>
> CPU throws a #PF with the SGX bit in the event of Enclave Page Cache Map
                                   ^
                                   set

> (EPCM) conflict. The EPCM is a CPU-internal table, which describes the
> properties for a enclave page. Enclaves are measured and signed software
> entities, which SGX hosts. [1]
>
> Although the primary purpose of the EPCM conflict checks is to prevent
> malicious accesses to an enclave, an illegit access can happen also for
> legit reasons.
>
> All SGX reserved memory, including EPCM is encrypted with a transient
> key that does not survive from the power transition. Throwing a SIGSEGV
> allows user space software react when this happens (e.g. rec-create the
                            ^
                            to                             recreate

> enclave, which was invalidated).
>
> [1] Intel SDM: 36.5.1 Enclave Page Cache Map (EPCM)
>
> Acked-by: Jethro Beekman <jethro@xxxxxxxxxxxx>
> Signed-off-by: Sean Christopherson <sean.j.christopherson@xxxxxxxxx>
> Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@xxxxxxxxxxxxxxx>
> ---
> arch/x86/include/asm/traps.h | 1 +
> arch/x86/mm/fault.c | 13 +++++++++++++
> 2 files changed, 14 insertions(+)
>
> diff --git a/arch/x86/include/asm/traps.h b/arch/x86/include/asm/traps.h
> index 714b1a30e7b0..ee3617b67bf4 100644
> --- a/arch/x86/include/asm/traps.h
> +++ b/arch/x86/include/asm/traps.h
> @@ -58,5 +58,6 @@ enum x86_pf_error_code {
>  	X86_PF_RSVD	=		1 << 3,
>  	X86_PF_INSTR	=		1 << 4,
>  	X86_PF_PK	=		1 << 5,
> +	X86_PF_SGX	=		1 << 15,

Needs to be added to the doc above it.
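
To make the bit layout concrete, here is a minimal userspace sketch that mirrors the enum and names the bits set in a given error code. PROT, WRITE and USER are the existing low bits that the hunk above does not show. The `PF_*` names and the `pf_describe()` helper are made up for this example and are not kernel API.

```c
#include <stddef.h>
#include <stdio.h>

/* Userspace mirror of enum x86_pf_error_code, for illustration only. */
enum pf_error_code {
	PF_PROT  = 1 << 0,	/* 0: no page found, 1: protection fault */
	PF_WRITE = 1 << 1,	/* 0: read access, 1: write access */
	PF_USER  = 1 << 2,	/* 0: kernel-mode access, 1: user-mode access */
	PF_RSVD  = 1 << 3,	/* use of reserved bit detected */
	PF_INSTR = 1 << 4,	/* fault was an instruction fetch */
	PF_PK    = 1 << 5,	/* protection keys block access */
	PF_SGX   = 1 << 15,	/* bit added by this patch */
};

/*
 * Write the names of all set bits into buf, separated by '|'.
 * Returns buf; the result is an empty string if no known bit is set.
 * Output is silently cut short if buf is too small.
 */
static const char *pf_describe(unsigned long error_code, char *buf, size_t len)
{
	static const struct {
		unsigned long bit;
		const char *name;
	} bits[] = {
		{ PF_PROT, "PROT" }, { PF_WRITE, "WRITE" }, { PF_USER, "USER" },
		{ PF_RSVD, "RSVD" }, { PF_INSTR, "INSTR" }, { PF_PK, "PK" },
		{ PF_SGX, "SGX" },
	};
	size_t used = 0;

	if (len == 0)
		return buf;
	buf[0] = '\0';

	for (size_t i = 0; i < sizeof(bits) / sizeof(bits[0]); i++) {
		if (!(error_code & bits[i].bit))
			continue;

		int n = snprintf(buf + used, len - used, "%s%s",
				 used ? "|" : "", bits[i].name);
		if (n < 0 || (size_t)n >= len - used)
			break;	/* out of space, keep what fits */
		used += (size_t)n;
	}
	return buf;
}
```

With this, an error code of 0x8005 decodes to PROT|USER|SGX, i.e. a user-mode protection fault with the SGX bit set.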

> #endif /* _ASM_X86_TRAPS_H */
> diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
> index 66be9bd60307..25d48aae36c1 100644
> --- a/arch/x86/mm/fault.c
> +++ b/arch/x86/mm/fault.c
> @@ -1055,6 +1055,19 @@ access_error(unsigned long error_code, struct vm_area_struct *vma)
>  	if (error_code & X86_PF_PK)
>  		return 1;
>  
> +	/*
> +	 * Access is blocked by the Enclave Page Cache Map (EPCM), i.e. the
> +	 * access is allowed by the PTE but not the EPCM. This usually happens
> +	 * when the EPCM is yanked out from under us, e.g. by hardware after a
> +	 * suspend/resume cycle. In any case, software, i.e. the kernel, can't
> +	 * fix the source of the fault as the EPCM can't be directly modified by
> +	 * software. Handle the fault as an access error in order to signal
> +	 * userspace so that userspace can rebuild their enclave(s), even though
> +	 * userspace may not have actually violated access permissions.
> +	 */

Lemme check whether I understand this correctly: userspace must check
whether the SIGSEGV is generated on an access to an enclave page?
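
To spell out what I think that would mean for userspace, here is a rough sketch, not something taken from the patchset. It assumes x86-64 Linux with glibc, where the page-fault error code is exposed to an SA_SIGINFO handler as `uc_mcontext.gregs[REG_ERR]`. `PF_SGX`, `recover_point`, `recover_armed` and the function names are hypothetical; how an enclave runtime really recovers is up to that runtime.

```c
#define _GNU_SOURCE
#include <setjmp.h>
#include <signal.h>
#include <stdbool.h>
#include <string.h>
#include <ucontext.h>

#define PF_SGX (1UL << 15)	/* mirrors X86_PF_SGX from the patch */

/* Hypothetical recovery point, set up with sigsetjmp() by the caller. */
static sigjmp_buf recover_point;
static volatile sig_atomic_t recover_armed;

/* True if the saved page-fault error code has the SGX bit set. */
static bool fault_is_sgx(const ucontext_t *uc)
{
#if defined(__x86_64__)
	return ((unsigned long)uc->uc_mcontext.gregs[REG_ERR] & PF_SGX) != 0;
#else
	(void)uc;	/* error code is not exposed this way elsewhere */
	return false;
#endif
}

static void segv_handler(int sig, siginfo_t *info, void *ctx)
{
	(void)info;

	/*
	 * EPCM fault: simply returning would re-execute the faulting
	 * access, so jump back to where the enclave can be rebuilt.
	 */
	if (recover_armed && fault_is_sgx((const ucontext_t *)ctx))
		siglongjmp(recover_point, 1);

	/* Ordinary segfault: restore the default action and re-raise. */
	signal(sig, SIG_DFL);
	raise(sig);
}

/* Install the handler; returns 0 on success, -1 on error. */
static int install_segv_handler(void)
{
	struct sigaction sa;

	memset(&sa, 0, sizeof(sa));
	sa.sa_sigaction = segv_handler;
	sa.sa_flags = SA_SIGINFO;
	sigemptyset(&sa.sa_mask);
	return sigaction(SIGSEGV, &sa, NULL);
}
```

The caller would do sigsetjmp(recover_point, 1) before entering the enclave, set recover_armed, and rebuild the enclave when sigsetjmp() returns nonzero.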

Also, do I see it correctly that when this happens, dmesg will have

	printk("%s%s[%d]: segfault at %lx ip %px sp %px error %lx",

due to:

	if (likely(show_unhandled_signals))
		show_signal_msg(regs, error_code, address, tsk);

which does:

	if (!unhandled_signal(tsk, SIGSEGV))
		return;

or is the task expected to register a SIGSEGV handler so that the
segfault doesn't land in dmesg?

If so, are we documenting this?

If not, then we should not issue any "segfault" messages to dmesg
because that would be wrong.

Or maybe I'm not seeing it right but I don't have the hardware to test
this out...

Thx.

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette