Re: [PATCH v6 06/11] x86/traps: Add #VE support for TDX guest

From: Dave Hansen
Date: Tue Sep 28 2021 - 11:23:23 EST

On 9/3/21 10:28 AM, Kuppuswamy Sathyanarayanan wrote:
> Virtualization Exceptions (#VE) are delivered to TDX guests due to
> specific guest actions which may happen in either user space or the kernel:
>  * Specific instructions (WBINVD, for example)
>  * Specific MSR accesses
>  * Specific CPUID leaf accesses
>  * Access to TD-shared memory, which includes MMIO
> In the settings that Linux will run in, virtual exceptions are never

^ virtualization

> generated on accesses to normal, TD-private memory that has been
> accepted.

We've gone over this at least half a dozen times. Sathya, please add
this to your cover letter and also to the TDX documentation if it's not
there already:

In the settings that Linux will run in, virtualization exceptions are
never generated on accesses to normal kernel memory (see #VE on Memory
Accesses below).


== #VE on Memory Accesses ==

A TD guest is in control of whether its memory accesses are treated as
private or shared. It selects the behavior with a bit in its page table
entries.
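
To make the "bit in the page table" idea concrete, here is a rough
sketch in plain C. It is not the kernel's implementation: the helper
names are made up for illustration, and the bit position is passed in
as a parameter because nothing above says which bit it is.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helpers, not kernel API. 'shared_bit' is the position of
 * the bit that marks a mapping as shared; the real position is platform
 * dependent and is an assumption of the caller here.
 */
static inline uint64_t td_shared_mask(unsigned int shared_bit)
{
	assert(shared_bit < 64);
	return UINT64_C(1) << shared_bit;
}

/* Mark a page table entry value as shared (visible to the hypervisor). */
static inline uint64_t pte_mk_shared(uint64_t pte, unsigned int shared_bit)
{
	return pte | td_shared_mask(shared_bit);
}

/* Mark a page table entry value as private (the default for a TD). */
static inline uint64_t pte_mk_private(uint64_t pte, unsigned int shared_bit)
{
	return pte & ~td_shared_mask(shared_bit);
}

/* Report whether an entry currently selects the shared behavior. */
static inline bool pte_is_shared(uint64_t pte, unsigned int shared_bit)
{
	return (pte & td_shared_mask(shared_bit)) != 0;
}
```

A caller would do something like pte_mk_shared(pte, 51) before handing
a buffer to the hypervisor, where 51 is only a placeholder value.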

=== #VE on Shared Pages ===

Accesses to shared mappings can cause #VEs. The hypervisor is in
control of when a #VE might occur, so the guest must be careful to only
reference shared pages when it is in a context that can safely handle a #VE.

However, shared mapping content cannot be trusted since shared page
content is writable by the hypervisor. Shared mappings are therefore
never used for sensitive memory contents like stacks or kernel text.
As a result, the shared mapping property of inducing #VEs requires
essentially no special kernel handling in sensitive contexts like
syscall entry or NMIs.

=== #VE on Private Pages ===

Some accesses to private mappings may cause #VEs. Before a mapping is
accepted (i.e., while it is in the SEPT_PENDING state), a reference
would cause a #VE. After acceptance, references typically succeed.

The hypervisor can cause a private page reference to fail if it chooses
to move an accepted page to a "blocked" state. However, if it does
this, a page access will not generate a #VE. It will, instead, cause a
"TD Exit" where the hypervisor is required to handle the exception.
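
The cases above can be summarized as a small decision table. The sketch
below is only a model of the text, written in plain C. The enum and
function names are hypothetical and do not correspond to anything in
the kernel or the TDX module.

```c
#include <assert.h>

/* Hypothetical page states, named after the cases described above. */
enum td_page_state {
	TD_PAGE_SHARED,           /* shared mapping, hypervisor-writable */
	TD_PAGE_PRIVATE_PENDING,  /* private, not yet accepted (SEPT_PENDING) */
	TD_PAGE_PRIVATE_ACCEPTED, /* private and accepted */
	TD_PAGE_PRIVATE_BLOCKED,  /* accepted, then "blocked" by the hypervisor */
};

/* Hypothetical outcomes of a guest access to a page in a given state. */
enum td_access_outcome {
	TD_ACCESS_OK,       /* access typically succeeds */
	TD_ACCESS_MAY_VE,   /* hypervisor decides whether a #VE occurs */
	TD_ACCESS_VE,       /* guest receives a #VE */
	TD_ACCESS_TD_EXIT,  /* no #VE; the hypervisor must handle a TD Exit */
};

/* Map a page state to the outcome the text describes for it. */
static enum td_access_outcome td_access_outcome(enum td_page_state state)
{
	switch (state) {
	case TD_PAGE_SHARED:
		return TD_ACCESS_MAY_VE;
	case TD_PAGE_PRIVATE_PENDING:
		return TD_ACCESS_VE;
	case TD_PAGE_PRIVATE_ACCEPTED:
		return TD_ACCESS_OK;
	case TD_PAGE_PRIVATE_BLOCKED:
		return TD_ACCESS_TD_EXIT;
	}
	assert(0 && "unknown page state");
	return TD_ACCESS_TD_EXIT;
}
```

The point of the table is that only the first two rows can deliver a
#VE to the guest, and neither applies to accepted private memory.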