Re: [PATCH 10/25] KVM: TDX: Initialize KVM supported capabilities when module setup

From: Xiaoyao Li
Date: Thu Sep 12 2024 - 05:07:56 EST


On 9/12/2024 4:43 PM, Nikolay Borisov wrote:


On 12.09.24 г. 11:37 ч., Xiaoyao Li wrote:
On 9/12/2024 4:04 PM, Nikolay Borisov wrote:


On 5.09.24 г. 16:36 ч., Xiaoyao Li wrote:
On 9/4/2024 7:58 PM, Nikolay Borisov wrote:


On 13.08.24 г. 1:48 ч., Rick Edgecombe wrote:
From: Xiaoyao Li <xiaoyao.li@xxxxxxxxx>

While TDX module reports a set of capabilities/features that it
supports, what KVM currently supports might be a subset of them.
E.g., DEBUG and PERFMON are supported by TDX module but currently not
supported by KVM.

Introduce a new struct kvm_tdx_caps to store KVM's capabilities of TDX.
supported_attrs and suppported_xfam are validated against fixed0/1
values enumerated by TDX module. Configurable CPUID bits derive from TDX
module plus applying KVM's capabilities (KVM_GET_SUPPORTED_CPUID),
i.e., mask off the bits that are configurable in the view of TDX module
but not supported by KVM yet.

KVM_TDX_CPUID_NO_SUBLEAF is the concept from TDX module, switch it to 0
and use KVM_CPUID_FLAG_SIGNIFCANT_INDEX, which are the concept of KVM.

Signed-off-by: Xiaoyao Li <xiaoyao.li@xxxxxxxxx>
Signed-off-by: Rick Edgecombe <rick.p.edgecombe@xxxxxxxxx>
---
uAPI breakout v1:
  - Change setup_kvm_tdx_caps() to use the exported 'struct tdx_sysinfo'
    pointer.
  - Change how to copy 'kvm_tdx_cpuid_config' since 'struct tdx_sysinfo'
    doesn't have 'kvm_tdx_cpuid_config'.
  - Updates for uAPI changes
---
  arch/x86/include/uapi/asm/kvm.h |  2 -
  arch/x86/kvm/vmx/tdx.c          | 81 +++++++++++++++++++++++++++++++++
  2 files changed, 81 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/uapi/asm/kvm.h b/arch/x86/include/uapi/asm/kvm.h
index 47caf508cca7..c9eb2e2f5559 100644
--- a/arch/x86/include/uapi/asm/kvm.h
+++ b/arch/x86/include/uapi/asm/kvm.h
@@ -952,8 +952,6 @@ struct kvm_tdx_cmd {
      __u64 hw_error;
  };
-#define KVM_TDX_CPUID_NO_SUBLEAF    ((__u32)-1)
-
  struct kvm_tdx_cpuid_config {
      __u32 leaf;
      __u32 sub_leaf;
diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index 90b44ebaf864..d89973e554f6 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -31,6 +31,19 @@ static void __used tdx_guest_keyid_free(int keyid)
      ida_free(&tdx_guest_keyid_pool, keyid);
  }
+#define KVM_TDX_CPUID_NO_SUBLEAF    ((__u32)-1)
+
+struct kvm_tdx_caps {
+    u64 supported_attrs;
+    u64 supported_xfam;
+
+    u16 num_cpuid_config;
+    /* This must the last member. */
+    DECLARE_FLEX_ARRAY(struct kvm_tdx_cpuid_config, cpuid_configs);
+};
+
+static struct kvm_tdx_caps *kvm_tdx_caps;
+
  static int tdx_get_capabilities(struct kvm_tdx_cmd *cmd)
  {
      const struct tdx_sysinfo_td_conf *td_conf = &tdx_sysinfo->td_conf;
@@ -131,6 +144,68 @@ int tdx_vm_ioctl(struct kvm *kvm, void __user *argp)
      return r;
  }
+#define KVM_SUPPORTED_TD_ATTRS (TDX_TD_ATTR_SEPT_VE_DISABLE)

Why isn't TDX_TD_ATTR_DEBUG added as well?

Because so far KVM doesn't support all the features of a DEBUG TD for userspace. e.g., KVM doesn't provide interface for userspace to read/write private memory of DEBUG TD.

But this means that you can't really run a TDX with SEPT_VE_DISABLE disabled for debugging purposes, so perhaps it might be necessary to rethink the condition allowing SEPT_VE_DISABLE to be disabled. Without the debug flag and SEPT_VE_DISABLE disabled the code refuses to start the VM, what if one wants to debug some SEPT issue by having an oops generated inside the vm ?

sept_ve_disable is allowed to be disable, i.e., set to 0.

I think there must be some misunderstanding.

There isn't, the current code is:

  201         if (!(td_attr & ATTR_SEPT_VE_DISABLE)) {
    1                 const char *msg = "TD misconfiguration: SEPT_VE_DISABLE attribute must be set.";
    2
    3                 /* Relax SEPT_VE_DISABLE check for debug TD. */
    4                 if (td_attr & ATTR_DEBUG)
    5                         pr_warn("%s\n", msg);
    6                 else
    7                         tdx_panic(msg);
    8         }


I.e if we disable SEPT_VE_DISABLE without having ATTR_DEBUG it results in a panic.

I see now.

It's linux TD guest's implementation, which requires SEPT_VE_DISABLE must be set unless it's a debug TD.

Yes, it can be the motivation to request KVM to add the support of ATTRIBUTES.DEBUG. But the support of ATTRIBUTES.DEBUG is not just allowing this bit to be set to 1. For DEBUG TD, VMM is allowed to read/write the private memory content, cpu registers, and MSRs, VMM is allowed to trap the exceptions in TD, VMM is allowed to manipulate the VMCS of TD vcpu, etc.

IMHO, for upstream, no need to support all the debug capability as described above. But we need firstly define a subset of them as the starter of supporting ATTRIBUTES.DEBUG. Otherwise, what is the meaning of KVM to allow the DEBUG to be set without providing any debug capability?

For debugging purpose, you can just hack guest kernel to allow spet_ve_disable to be 0 without DEBUG bit set, or hack KVM to allow DEBUG bit to be set.



<snip>