Re: [RFC v2-fix-v2 1/1] x86/boot: Avoid #VE during boot for TDX platforms
From: Sean Christopherson
Date: Fri May 21 2021 - 14:18:52 EST
On Fri, May 21, 2021, Dave Hansen wrote:
> > + /*
> > + * Preserve current value of EFER for comparison and to skip
> > + * EFER writes if no change was made (for TDX guest)
> > + */
> > + movl %eax, %edx
> > btsl $_EFER_SCE, %eax /* Enable System Call */
> > btl $20,%edi /* No Execute supported? */
> > jnc 1f
> > btsl $_EFER_NX, %eax
> > btsq $_PAGE_BIT_NX,early_pmd_flags(%rip)
> > -1: wrmsr /* Make changes effective */
> >
> > + /* Avoid writing EFER if no change was made (for TDX guest) */
> > +1: cmpl %edx, %eax
> > + je 1f
> > + xor %edx, %edx
> > + wrmsr /* Make changes effective */
> > +1:
>
> Just curious, but what if this goes wrong? Say the TDX firmware didn't
> set up EFER correctly and this code does the WRMSR.
By firmware, do you mean the TDX-Module or guest firmware? EFER is read-only in a
TDX guest, i.e. the guest firmware can't change it either.
> What ends up happening? Do we get anything out on the console, or is it
> essentially undebuggable?
Assuming "firmware" means TDX-Module, if the TDX-Module botches EFER (and only EFER)
then odds are very, very good that the guest will never get to the kernel as it
will have died long before in guest BIOS.
If the bug is such that EFER is correct in hardware, but RDMSR returns the wrong
value (due to MSR interception), IIRC this will triple fault and so nothing will
get logged. But, the odds of that type of bug being hit in production are
practically zero because the EFER setup is very static, i.e. any such bug should
be hit during qualification of the VMM+TDX-Module.
In any case, even if a bug escapes, the shutdown is relatively easy to debug even
without logs because the failure will clearly point at the WRMSR (that info can be
had by running a debug TD or a debug TDX-Module). By TDX standards, debugging
shutdowns on a specific instruction is downright trivial :-).
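For anyone following along, the compare-and-skip logic in the quoted hunk can be modeled roughly in C as below. This is only an illustrative sketch, not the patch itself: efer_needs_write() is a hypothetical helper name, it only looks at the low 32 bits of EFER, and it assumes the usual bit positions (SCE is bit 0, NX is bit 11, and NX support is reported in CPUID 0x80000001 EDX bit 20).

```c
#include <stdbool.h>
#include <stdint.h>

#define EFER_SCE (1u << 0)   /* _EFER_SCE: SYSCALL/SYSRET enable */
#define EFER_NX  (1u << 11)  /* _EFER_NX: no-execute enable */
#define CPUID_NX (1u << 20)  /* CPUID 0x80000001 EDX bit 20: NX supported */

/*
 * Hypothetical helper modeling the quoted assembly: compute the EFER value
 * the kernel wants and report whether a WRMSR would actually be needed.
 * cur_lo is the low half of EFER as returned by RDMSR.
 */
static bool efer_needs_write(uint32_t cur_lo, uint32_t cpuid_edx,
                             uint32_t *new_lo)
{
    uint32_t want = cur_lo | EFER_SCE;      /* btsl $_EFER_SCE, %eax */

    if (cpuid_edx & CPUID_NX)               /* btl $20, %edi ; jnc 1f */
        want |= EFER_NX;                    /* btsl $_EFER_NX, %eax */

    *new_lo = want;
    return want != cur_lo;                  /* cmpl %edx, %eax ; je 1f */
}
```

If EFER already has SCE and NX set, as would be expected in a correctly configured TDX guest, the helper returns false and the WRMSR (and thus the #VE) is skipped. It returns true only when RDMSR reports a value that differs from what the kernel wants, which is the failure case discussed above.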