Re: [RFC v1 09/26] x86/tdx: Handle CPUID via #VE

From: Sean Christopherson
Date: Mon Feb 08 2021 - 13:57:33 EST


On Sun, Feb 07, 2021, Andy Lutomirski wrote:
>
> > On Feb 7, 2021, at 2:31 PM, Dave Hansen <dave.hansen@xxxxxxxxx> wrote:
> >
> > On 2/7/21 12:29 PM, Kirill A. Shutemov wrote:
> >>> Couldn't you just have one big helper that takes *all* the registers
> >>> that get used in any TDVMCALL and sets all the rcx bits? The users
> >>> could just pass 0's for the things they don't use.

IIRC, having exactly one helper is a big mess (my original code used a single
main helper). CPUID has a large number of outputs, and because outputs are
handled as pointers, the assembly routine needs to check output params for NULL.

And if we want to write up port I/O directly to TDVMCALL to avoid the #VE, IN
and OUT need separate helpers to implement a non-standard register ABI in order
to play nice with ALTERANTIVES.

This also has my vote, mainly because gcc doesn't allow directly specifying
r8-r15 as register constraints to inline asm. That creates a nasty hole where
a register can get corrupted if code is inserted between writing the local
variable and passing it to the inline asm.

Case in point, patch 08 has this exact bug.

> +static u64 tdx_read_msr_safe(unsigned int msr, int *err)
> +{
> +       register long r10 asm("r10") = TDVMCALL_STANDARD;
> +       register long r11 asm("r11") = EXIT_REASON_MSR_READ;
> +       register long r12 asm("r12") = msr;
> +       register long rcx asm("rcx");
> +       long ret;
> +
> +       WARN_ON_ONCE(tdx_is_context_switched_msr(msr));

This can corrupt r10, r11 and/or r12, e.g. if tdx_is_context_switched_msr() is
not inlined, it can use r10 and r11 as scratch registers, and gcc isn't smart
enough to know it needs to save registers before the call.

Even if the code as committed is guaranteed to work, IMO this approach is
hostile toward future debuggers/developers, e.g. adding a printk() in here to
debug can introduce a completely different failure.

> +
> +       if (msr == MSR_CSTAR)
> +               return 0;
> +
> +       /* Allow to pass R10, R11 and R12 down to the VMM */
> +       rcx = BIT(10) | BIT(11) | BIT(12);
> +
> +       asm volatile(TDCALL
> +                       : "=a"(ret), "=r"(r10), "=r"(r11), "=r"(r12)
> +                       : "a"(TDVMCALL), "r"(rcx), "r"(r10), "r"(r11), "r"(r12)
> +                       : );
> +
> +       /* XXX: Better error handling needed? */
> +       *err = (ret || r10) ? -EIO : 0;
> +
> +       return r11;
> +}

> >>> Then you've got the ugly inline asm in one place. It also makes it
> >>> harder to screw up the 'rcx' mask and end up passing registers you
> >>> didn't want into a malicious VMM.
> >> For now we only pass down R10-R15, but the interface allows to pass down
> >> much wider set of registers, including XMM. How far do we want to get it?
> >
> > Just do what we immediately need: R10-R15
> > .
> >
>
> How much of the register state is revealed to the VMM when we do a TDVMCALL?
> Presumably we should fully sanitize all register state that shows up in
> cleartext on the other end, and we should treat all regs that can be modified
> by the VMM as clobbered.

The guest gets to choose, with a few restrictions. RSP cannot be exposed to the
host. RAX, RCX, R10, and R11 are always exposed as they hold mandatory info
about the TDVMCALL (TDCALL fn, GPR mask, GHCI vs. vendor, and TDVMCALL fn). All
other GPRs are exposed and clobbered if their bit in RCX is set, otherwise they
are saved/restored by the TDX-Module.

I agree with Dave, pass everything required by the GHCI in the main routine, and
sanitize and save/restore all such GPRs.