Re: [RFC v2-fix-v5 1/1] x86: Skip WBINVD instruction for VM guest
From: Dave Hansen
Date: Wed Jun 09 2021 - 17:03:24 EST
This changelog lacks both clear problem statements and a clear solution
implemented within the patch.
Here's a proposed changelog. It clearly spells out the two problems
caused by WBINVD within a guest, and the proposed solution which fixes
those two problems.
Is this missing anything?
--
VM guests that support ACPI use standard ACPI mechanisms to signal sleep
state entry to the host. To ACPI, reboot is simply another sleep state.
ACPI specifies that the platform preserve memory contents over (some)
sleep states. It does not specify any requirements for data
preservation in CPU caches. The ACPI specification mandates the use of
WBINVD to flush the contents of the CPU caches to memory before entering
specific sleep states, thus ensuring data in caches can survive sleep
state transitions.
Unlike when entering sleep states on bare metal, no actions within a guest
can cause data in processor caches to be lost. That makes these WBINVD
invocations harmless but superfluous within a guest. (<--- problem #1)
In TDX guests, these WBINVD operations cause #VE exceptions. For debugging,
it would be ideal for the #VE handler to be able to WARN() when an
unexpected WBINVD occurs. (<--- problem #2)
Avoid WBINVD for all ACPI cache-flushing operations which occur while
running under a hypervisor, which includes TDX guests. This both avoids
TDX warnings and optimizes away superfluous WBINVD invocations. (<---
solution)
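
For illustration only, here is a rough userspace sketch of the decision the
changelog describes: flush on bare metal, skip the flush under a hypervisor.
It is not the patch itself. The helper names (ecx_has_hypervisor_bit,
running_under_hypervisor, acpi_should_flush_cache) are hypothetical, and the
sketch assumes the hypervisor-present bit (CPUID.01H:ECX bit 31) is the
signal used to detect a guest. It only computes the decision; WBINVD is a
privileged instruction and is never executed here.

```c
#include <stdbool.h>
#include <stdint.h>
#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>	/* GCC/Clang helper for the CPUID instruction */
#endif

/* CPUID.01H:ECX bit 31 is set by hypervisors to announce their presence. */
#define CPUID_1_ECX_HYPERVISOR	(1u << 31)

/* Hypothetical helper: decode the hypervisor-present bit from a raw ECX. */
static bool ecx_has_hypervisor_bit(uint32_t ecx)
{
	return (ecx & CPUID_1_ECX_HYPERVISOR) != 0;
}

/* Hypothetical helper: query CPUID leaf 1 and report whether we are a guest. */
static bool running_under_hypervisor(void)
{
#if defined(__x86_64__) || defined(__i386__)
	unsigned int eax, ebx, ecx, edx;

	if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
		return false;	/* leaf 1 unavailable: assume bare metal */
	return ecx_has_hypervisor_bit(ecx);
#else
	return false;		/* non-x86 build: nothing to detect */
#endif
}

/*
 * Hypothetical policy function: the ACPI cache flush is needed only on
 * bare metal, so it is skipped whenever a hypervisor is present.
 */
static bool acpi_should_flush_cache(bool under_hypervisor)
{
	return !under_hypervisor;
}
```

A caller would evaluate acpi_should_flush_cache(running_under_hypervisor())
before the sleep state transition and issue the flush only when it returns
true. Keeping the policy separate from the detection makes it easy to check
both cases without needing to run inside a guest.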