[PATCH 0/2] KVM: kvm-coco-queue: Support protected TSC

From: Isaku Yamahata
Date: Sat Oct 12 2024 - 03:56:22 EST


This patch series is for the kvm-coco-queue branch. The change for TDX KVM is
included at the last. The test is done by create TDX vCPU and run, get TSC
offset via vCPU device attributes and compare it with the TDX TSC OFFSET
metadata. Because the test requires the TDX KVM and TDX KVM kselftests, don't
include it in this patch series.


Background
----------
X86 confidential computing technology defines protected guest TSC so that the
VMM can't change the TSC offset/multiplier once vCPU is initialized and the
guest can trust TSC. The SEV-SNP defines Secure TSC as optional. TDX mandates
it. The TDX module determines the TSC offset/multiplier. The VMM has to
retrieve them.

On the other hand, the x86 KVM common logic tries to guess or adjust the TSC
offset/multiplier for better guest TSC and TSC interrupt latency at KVM vCPU
creation (kvm_arch_vcpu_postcreate()), vCPU migration over pCPU
(kvm_arch_vcpu_load()), vCPU TSC device attributes (kvm_arch_tsc_set_attr()) and
guest/host writing to TSC or TSC adjust MSR (kvm_set_msr_common()).


Problem
-------
The current x86 KVM implementation conflicts with protected TSC because the
VMM can't change the TSC offset/multiplier. Disable or ignore the KVM
logic to change/adjust the TSC offset/multiplier somehow.

Because KVM emulates the TSC timer or the TSC deadline timer with the TSC
offset/multiplier, the TSC timer interrupts are injected to the guest at the
wrong time if the KVM TSC offset is different from what the TDX module
determined.

Originally the issue was found by cyclic test of rt-test [1] as the latency in
TDX case is worse than VMX value + TDX SEAMCALL overhead. It turned out that
the KVM TSC offset is different from what the TDX module determines.


Solution
--------
The solution is to keep the KVM TSC offset/multiplier the same as the value of
the TDX module somehow. Possible solutions are as follows.
- Skip the logic
Ignore (or don't call related functions) the request to change the TSC
offset/multiplier.
Pros
- Logically clean. This is similar to the guest_protected case.
Cons
- Needs to identify the call sites.

- Revert the change at the hooks after TSC adjustment
x86 KVM defines the vendor hooks when the TSC offset/multiplier are
changed. The callback can revert the change.
Pros
- We don't need to care about the logic to change the TSC offset/multiplier.
Cons:
- Hacky to revert the KVM x86 common code logic.

Choose the first one. With this patch series, SEV-SNP secure TSC can be
supported.


Patches:
1: Preparation for the next patch
2: Skip the logic to adjust the TSC offset/multiplier in the common x86 KVM logic

[1] https://git.kernel.org/pub/scm/utils/rt-tests/rt-tests.git

Changes for TDX KVM

diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index 8785309ccb46..969da729d89f 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -694,8 +712,6 @@ int tdx_vcpu_create(struct kvm_vcpu *vcpu)
vcpu->arch.cr0_guest_owned_bits = -1ul;
vcpu->arch.cr4_guest_owned_bits = -1ul;

- vcpu->arch.tsc_offset = kvm_tdx->tsc_offset;
- vcpu->arch.l1_tsc_offset = vcpu->arch.tsc_offset;
/*
* TODO: support off-TD debug. If TD DEBUG is enabled, guest state
* can be accessed. guest_state_protected = false. and kvm ioctl to
@@ -706,6 +722,13 @@ int tdx_vcpu_create(struct kvm_vcpu *vcpu)
*/
vcpu->arch.guest_state_protected = true;

+ /* VMM can't change TSC offset/multiplier as TDX module manages them. */
+ vcpu->arch.guest_tsc_protected = true;
+ vcpu->arch.tsc_offset = kvm_tdx->tsc_offset;
+ vcpu->arch.l1_tsc_offset = vcpu->arch.tsc_offset;
+ vcpu->arch.tsc_scaling_ratio = kvm_tdx->tsc_multiplier;
+ vcpu->arch.l1_tsc_scaling_ratio = kvm_tdx->tsc_multiplier;
+
if ((kvm_tdx->xfam & XFEATURE_MASK_XTILE) == XFEATURE_MASK_XTILE)
vcpu->arch.xfd_no_write_intercept = true;

@@ -2674,6 +2697,7 @@ static int tdx_td_init(struct kvm *kvm, struct kvm_tdx_cmd *cmd)
goto out;

kvm_tdx->tsc_offset = td_tdcs_exec_read64(kvm_tdx, TD_TDCS_EXEC_TSC_OFFSET);
+ kvm_tdx->tsc_multiplier = td_tdcs_exec_read64(kvm_tdx, TD_TDCS_EXEC_TSC_MULTIPLIER);
kvm_tdx->attributes = td_params->attributes;
kvm_tdx->xfam = td_params->xfam;

diff --git a/arch/x86/kvm/vmx/tdx.h b/arch/x86/kvm/vmx/tdx.h
index 614b1c3b8483..c0e4fa61cab1 100644
--- a/arch/x86/kvm/vmx/tdx.h
+++ b/arch/x86/kvm/vmx/tdx.h
@@ -42,6 +42,7 @@ struct kvm_tdx {
bool tsx_supported;

u64 tsc_offset;
+ u64 tsc_multiplier;

enum kvm_tdx_state state;

diff --git a/arch/x86/kvm/vmx/tdx_arch.h b/arch/x86/kvm/vmx/tdx_arch.h
index 861c0f649b69..be4cf65c90a8 100644
--- a/arch/x86/kvm/vmx/tdx_arch.h
+++ b/arch/x86/kvm/vmx/tdx_arch.h
@@ -69,6 +69,7 @@

enum tdx_tdcs_execution_control {
TD_TDCS_EXEC_TSC_OFFSET = 10,
+ TD_TDCS_EXEC_TSC_MULTIPLIER = 11,
};

enum tdx_vcpu_guest_other_state {

---
Isaku Yamahata (2):
KVM: x86: Push down setting vcpu.arch.user_set_tsc
KVM: x86: Don't allow tsc_offset, tsc_scaling_ratio to change

arch/x86/include/asm/kvm_host.h | 1 +
arch/x86/kvm/x86.c | 21 ++++++++++++++-------
2 files changed, 15 insertions(+), 7 deletions(-)


base-commit: 909f9d422f59f863d7b6e4e2c6e57abb97a27d4d
--
2.45.2