Re: [RFC v2-fix-v4 1/1] x86/tdx: Skip WBINVD instruction for TDX guest
From: Andy Lutomirski
Date: Wed Jun 09 2021 - 00:02:38 EST
On 6/8/21 8:40 PM, Dan Williams wrote:
> On Tue, Jun 8, 2021 at 6:10 PM Kuppuswamy Sathyanarayanan
> <sathyanarayanan.kuppuswamy@xxxxxxxxxxxxxxx> wrote:
>>
>> Current TDX spec does not have support to emulate the WBINVD
>> instruction. If any feature that uses WBINVD is enabled/used
>> in TDX guest, it will lead to un-handled #VE exception, which
>> will be handled as #GP fault.
>>
>> ACPI drivers also uses WBINVD instruction for cache flushes in
>> reboot or shutdown code path. Since TDX guest has requirement
>> to support shutdown feature, skip WBINVD instruction usage
>> in ACPI drivers for TDX guest.
>
> This sounds awkward...
>
>> Since cache is always coherent in TDX guests, making wbinvd as
>
> This is incorrect, ACPI cache flushing is not about I/O or CPU coherency...
>
>> noop should not cause any issues in above mentioned code path.
>
> ..."should" is a famous last word...
>
>> The end-behavior is the same as KVM guest (treat as noops).
>
> ..."KVM gets away with it" is not a justification that TDX can stand
> on otherwise we would not be here fixing up ACPICA properly.
>
> How about:
>
> "TDX guests use standard ACPI mechanisms to signal sleep state entry
> (including reboot) to the host. The ACPI specification mandates WBINVD
> on any sleep state entry with the expectation that the platform is
> only responsible for maintaining the state of memory over sleep
> states, not preserving dirty data in any CPU caches. ACPI cache
> flushing requirements pre-date the advent of virtualization. Given TDX
> guest sleep state entry does not affect any host power rails it is not
> required to flush caches. The host is responsible for maintaining
> cache state over its own bare metal sleep state transitions that
> power-off the cache. If the host fails to manage caches over its sleep
> state transitions the guest..."
>
I like this description, but shouldn't the logic be:
	if (!CPUID has hypervisor bit set)
		wbinvd();
As far as I know, most hypervisors will turn WBINVD into a noop and,
even if they don't, it seems to me that something must be really quite
wrong for a guest to need to WBINVD for ACPI purposes.
-Andy
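
For illustration only, the check suggested above could be sketched in
userspace C roughly as follows. This is not the kernel patch under
discussion. It assumes GCC or Clang on x86 (for <cpuid.h>), and the helper
names ecx_has_hypervisor_bit(), running_under_hypervisor() and
acpi_sleep_should_wbinvd() are hypothetical. WBINVD itself is privileged,
so the sketch only computes the decision and never executes the
instruction. In-kernel code would more likely test
boot_cpu_has(X86_FEATURE_HYPERVISOR) than issue CPUID directly.

```c
#include <stdbool.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>	/* GCC/Clang wrapper around the CPUID instruction */
#endif

/* CPUID leaf 1, ECX bit 31: reserved for hypervisors to advertise themselves. */
#define CPUID_1_ECX_HYPERVISOR	(1u << 31)

/* Hypothetical helper: decode the hypervisor bit from a leaf-1 ECX value. */
static bool ecx_has_hypervisor_bit(unsigned int ecx)
{
	return (ecx & CPUID_1_ECX_HYPERVISOR) != 0;
}

/* Hypothetical helper: true if CPUID reports that we run under a hypervisor. */
static bool running_under_hypervisor(void)
{
#if defined(__x86_64__) || defined(__i386__)
	unsigned int eax, ebx, ecx, edx;

	/* __get_cpuid() returns 0 when the requested leaf is not supported. */
	if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
		return false;
	return ecx_has_hypervisor_bit(ecx);
#else
	return false;	/* assumption: non-x86 builds are treated as bare metal */
#endif
}

/*
 * Hypothetical helper mirroring the pseudocode: flush caches on ACPI
 * sleep state entry only when no hypervisor is present.
 */
static bool acpi_sleep_should_wbinvd(bool hypervisor)
{
	return !hypervisor;
}
```

A caller would evaluate acpi_sleep_should_wbinvd(running_under_hypervisor())
and issue the flush only when it returns true. Keeping the bit test in a
separate pure function makes the decision checkable without depending on the
machine the code happens to run on.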