Re: [PATCH v6 11/11] x86/tdx: Handle CPUID via #VE

From: Dave Hansen
Date: Fri Sep 03 2021 - 14:35:54 EST


On 9/3/21 10:28 AM, Kuppuswamy Sathyanarayanan wrote:
> From: "Kirill A. Shutemov" <kirill.shutemov@xxxxxxxxxxxxxxx>
>
> TDX has three classes of CPUID leaves: some CPUID leaves are always
> handled by the CPU, others are handled by the TDX module, and some
> others are handled by the VMM. Since the VMM cannot directly intercept
> the instruction these are reflected with a #VE exception to the guest,
> which then converts it into a hypercall to the VMM, or handled
> directly.

Does this patch do any of the "handled directly" leaves? If not, why
mention it?

It would also be nice to mention that this applies to both kernel and
userspace use of CPUID. It talks a bit about early kernel use, which
makes it seem like this is kernel-only.

I also think it's a mistake to talk about TDX-module handling. For
*this* patch, it doesn't matter.

Here's a reformatted replacement changelog:

--

When running virtualized, the CPUID instruction is handled differently
based on the leaf being accessed. The behavior depends only on on the
leaf and applies equally to both kernel/ring-0 and userspace/ring-3
execution of CPUID. Historically, there are two basic classes:

* Leaves handled transparently to the guest
* Leaves handled by the VMM

In a typical guest without TDX, "handled by the VMM" leaves cause a
VMEXIT. TDX replaces these VMEXITs with a #VE exception in the guest.
The guest typically handles the #VE by making a hypercall to the VMM.

The TDX module spec talks about a few more classes of CPUID handling.
But, for the purposes of this patch, the "handled transparently" CPUID
leaves are all lumped together because the guest handling is the same.

--

> The TDX module specification [1], sec 16.2 has a full list of CPUID

^ I think we can spare the extra four bytes to make "sec" ->
"section".

I also opened up the pdf from [1] an searched for "16.2". I found:

16.2. Branch Prediction Side Channel Attacks Mitigation
Mechanisms

There is, however, a:

18.2. CPUID Virtualization

section. Did you, perhaps, mean to reference that instead?

Which kinda argues for not using these section numbers at *all*.
Perhaps you should just mention the section titles, as they're obviously
less volatile. That's probably a comment that applies to *ALL* of your
changelogs across *ALL* TDX patches. This just proves that the section
numbers are worthless.

> leaves which are handled natively or by the TDX module. Only unknown

Just in terms of nice writing, it would be great to use the same
language when you refer to the same concept. Earlier you called this
"handled by the CPU". But, here you refer to it as being "handled
natively". Neither is wrong, but this puts a burden on the reader to
make a connection rather than doing it for them as the writer.

> CPUIDs are handled by the #VE method. In practice this typically only
> applies to the hypervisor-specific CPUIDs unknown to the native CPU.
>
> Therefore there is no risk of causing this in early CPUID code which
> runs before the #VE handler is set up because it will never access
> those exotic CPUID leaves.

This never actually makes a transition from background to telling what
the patch does. I think this needs at least this:

Allow the the #VE code to handle any CPUID leaves which cause a
#VE. Unconditionally make a TDCALL to the VMM.

> diff --git a/arch/x86/kernel/tdx.c b/arch/x86/kernel/tdx.c
> index 5c52dde4a5fd..c65c117aff5f 100644
> --- a/arch/x86/kernel/tdx.c
> +++ b/arch/x86/kernel/tdx.c
> @@ -150,6 +150,21 @@ static int tdx_write_msr_safe(unsigned int msr, unsigned int low,
> return ret ? -EIO : 0;
> }
>
> +static u64 tdx_handle_cpuid(struct pt_regs *regs)
> +{
> + struct tdx_hypercall_output out = {0};
> + u64 ret;
> +
> + ret = _tdx_hypercall(EXIT_REASON_CPUID, regs->ax, regs->cx, 0, 0, &out);
> +
> + regs->ax = out.r12;
> + regs->bx = out.r13;
> + regs->cx = out.r14;
> + regs->dx = out.r15;

This probably needs a comment about why this is shuffling registers
around like this.

> + return ret;
> +}
> +
> unsigned long tdx_get_ve_info(struct ve_info *ve)
> {
> struct tdx_module_output out = {0};
> @@ -193,6 +208,9 @@ int tdx_handle_virtualization_exception(struct pt_regs *regs,
> case EXIT_REASON_MSR_WRITE:
> ret = tdx_write_msr_safe(regs->cx, regs->ax, regs->dx);
> break;
> + case EXIT_REASON_CPUID:
> + ret = tdx_handle_cpuid(regs);
> + break;
> default:
> pr_warn("Unexpected #VE: %lld\n", ve->exit_reason);
> return -EFAULT;
>