Re: [RFC v2-fix-v4 1/1] x86/tdx: Skip WBINVD instruction for TDX guest

From: Kuppuswamy, Sathyanarayanan
Date: Wed Jun 09 2021 - 13:28:25 EST




On 6/9/21 9:12 AM, Andy Lutomirski wrote:
On 6/9/21 8:09 AM, Dan Williams wrote:
On Tue, Jun 8, 2021 at 9:27 PM Andi Kleen <ak@xxxxxxxxxxxxxxx> wrote:


here is no resume path.

Host is free to go into S3 independent of any guest state.

Actually my understanding is that none of the systems which support TDX
support S3. S3 has been deprecated for a long time.

Ok, I wanted to imply any power state that might power-off caches.



A hostile
host is free to do just enough cache management so that it can resume
from S3 while arranging for TDX guest dirty data to be lost. Does a
TDX guest go fatal if the cache loses power?

That would be a machine check, and yes it would be fatal.

Sounds good, so incorporating this and Andy's feedback:

"TDX guests, like other typical guests, use standard ACPI mechanisms
to signal sleep state entry (including reboot) to the host. The ACPI
specification mandates WBINVD on any sleep state entry with the
expectation that the platform is only responsible for maintaining the
state of memory over sleep states, not preserving dirty data in any
CPU caches. ACPI cache flushing requirements pre-date the advent of
virtualization. Given guest sleep state entry does not affect any host
power rails it is not required to flush caches. The host is
responsible for maintaining cache state over its own bare metal sleep
state transitions that power-off the cache. A TDX guest, unlike a
typical guest, will machine check if the CPU cache is powered off."

Andi, is that machine check behavior relative to power states
mentioned in the docs?

I don't think there's anything about power states. There is a general
documented mechanism to integrity-check TD guest memory, but it is *not*
replay-resistant. So, if the guest dirties a cache line, and the cache
line is lost, it seems entirely plausible that the guest would get
silently corrupted.

I would argue that, if this happens, it's a host, TD module, or
architecture bug, and it's not the guest's fault.

If you want to apply this fix for all hypervisors (using boot_cpu_has
(X86_FEATURE_HYPERVISOR) check), then we don't need any TDX specific
reference in commit log right? It can be generalized for all VM guests.

agree?


--Andy


--
Sathyanarayanan Kuppuswamy
Linux Kernel Developer