Re: [RFC v2-fix-v2 1/1] x86/boot: Avoid #VE during boot for TDX platforms

From: Dave Hansen
Date: Fri May 21 2021 - 14:30:33 EST


On 5/21/21 11:18 AM, Sean Christopherson wrote:
> On Fri, May 21, 2021, Dave Hansen wrote:
>>> + /*
>>> + * Preserve current value of EFER for comparison and to skip
>>> + * EFER writes if no change was made (for TDX guest)
>>> + */
>>> + movl %eax, %edx
>>> btsl $_EFER_SCE, %eax /* Enable System Call */
>>> btl $20,%edi /* No Execute supported? */
>>> jnc 1f
>>> btsl $_EFER_NX, %eax
>>> btsq $_PAGE_BIT_NX,early_pmd_flags(%rip)
>>> -1: wrmsr /* Make changes effective */
>>>
>>> + /* Avoid writing EFER if no change was made (for TDX guest) */
>>> +1: cmpl %edx, %eax
>>> + je 1f
>>> + xor %edx, %edx
>>> + wrmsr /* Make changes effective */
>>> +1:
>>
>> Just curious, but what if this goes wrong? Say the TDX firmware didn't
>> set up EFER correctly and this code does the WRMSR.
>
> By firmware, do you mean TDX-module, or guest firmware? EFER is read-only in a
> TDX guest, i.e. the guest firmware can't change it either.

I guess I was assuming that the trusted BIOS was going to do the setup
of EFER before it hands control over to the kernel. So, I *meant* the BIOS.

But, I see from below that it's probably the TDX-module that's
responsible for this behavior.

>> What ends up happening? Do we get anything out on the console, or is it
>> essentially undebuggable?
>
> Assuming "firmware" means TDX-module, if TDX-Module botches EFER (and only EFER)
> then odds are very, very good that the guest will never get to the kernel as it
> will have died long before in guest BIOS.
>
> If the bug is such that EFER is correct in hardware, but RDMSR returns the wrong
> value (due to MSR interception), IIRC this will triple fault and so nothing will
> get logged. But, the odds of that type of bug being hit in production are
> practically zero because the EFER setup is very static, i.e. any such bug should
> be hit during qualification of the VMM+TDX-Module.
>
> In any case, even if a bug escapes, the shutdown is relatively easy to debug even
> without logs because the failure will clearly point at the WRMSR (that info can be
> had by running a debug TD or a debug TDX-Module). By TDX standards, debugging
> shutdowns on a specific instruction is downright trivial :-).

That sounds sane to me. It would be nice to get this into the
changelog. Perhaps:

This theoretically makes guest boot more fragile. If, for
instance, EFER was set up incorrectly and a WRMSR was performed,
the resulting (unhandled) #VE would triple fault. However, this
is likely to trip up the guest BIOS long before control reaches
the kernel. In any case, these kinds of problems are unlikely
to occur in production environments, and developers have good
debug tools to fix them quickly.

That would put my mind at ease a bit.
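
For reference, the control flow of the quoted hunk can be sketched in C
roughly as follows. This is only an illustration, not the kernel code: the
real thing is early-boot assembly, and fake_efer, fake_wrmsr() and
setup_efer() are hypothetical names standing in for the MSR and the WRMSR
instruction. It also compares the full 64-bit value, whereas the hunk only
looks at the low 32 bits and zeroes %edx before the write.

```c
#include <stdint.h>

#define EFER_SCE_BIT 0   /* System Call Extensions enable */
#define EFER_NX_BIT  11  /* No-Execute enable */

/* Hypothetical stand-ins for the real MSR and the WRMSR instruction. */
static uint64_t fake_efer;
static int wrmsr_count;

static void fake_wrmsr(uint64_t val)
{
	fake_efer = val;
	wrmsr_count++;   /* in a TDX guest, this write is what would raise #VE */
}

/* Compute the desired EFER value and write it only if it differs. */
static void setup_efer(int nx_supported)
{
	uint64_t old_val = fake_efer;               /* preserved for comparison */
	uint64_t new_val = old_val | (1ULL << EFER_SCE_BIT);

	if (nx_supported)
		new_val |= 1ULL << EFER_NX_BIT;

	if (new_val != old_val)                     /* skip the write if unchanged */
		fake_wrmsr(new_val);
}
```

If EFER already has the wanted bits set when setup_efer() runs, as is
expected in a TDX guest, the write is skipped entirely and no #VE can be
triggered on this path.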