Re: [PATCH v13 016/113] KVM: TDX: x86: Add ioctl to get TDX systemwide parameters

From: Zhi Wang
Date: Fri Mar 31 2023 - 08:47:16 EST


On Thu, 30 Mar 2023 17:18:03 -0700
Isaku Yamahata <isaku.yamahata@xxxxxxxxx> wrote:

> On Wed, Mar 29, 2023 at 04:17:22PM -0700,
> Isaku Yamahata <isaku.yamahata@xxxxxxxxx> wrote:
>
> > On Sat, Mar 25, 2023 at 10:43:06AM +0200,
> > Zhi Wang <zhi.wang.linux@xxxxxxxxx> wrote:
> >
> > > On Sun, 12 Mar 2023 10:55:40 -0700
> > > isaku.yamahata@xxxxxxxxx wrote:
> > >
> > > Does this have to be a new generic ioctl with a dedicated new x86_ops? SNP
> > > does not use it at all and all the system-scoped ioctl of SNP going through
> > > the CCP driver. So getting system-scope information of TDX/SNP will end up
> > > differently.
> > >
> > > Any thought, Sean? Moving getting SNP system-wide information to
> > > KVM dev ioctl seems not ideal and TDX does not have a dedicated driver like
> > > CCP. Maybe make this ioctl TDX-specific? KVM_TDX_DEV_OP?
> >
> > We only need global parameters of the TDX module, and we don't interact with TDX
> > module at this point. One alternative is to export those parameters via sysfs.
> > Also the existence of the sysfs node indicates that the TDX module is
> > loaded(initialized?) or not in addition to boot log. Thus we can drop system
> > scope one.
> > What do you think?
> >

I like this idea and the patch below, it feels right for me now. It would be nice
if more folks can chime in and comment.

> > Regarding to other TDX KVM specific ioctls (KVM_TDX_INIT_VM, KVM_TDX_INIT_VCPU,
> > KVM_TDX_INIT_MEM_REGION, and KVM_TDX_FINALIZE_VM), they are specific to KVM. So
> > I don't think it can be split out to independent driver.
>

They can stay in KVM as they are KVM-specific. SNP also has KVM-specific ioctls
which wraps the SEV driver calls. At this level, both TDX and SNP go their specific
implementation without more abstraction other than KVM_ENCRYPT_MEMORY_OP. Their
strategies are aligned.

The problem of the previous approach was the abstraction that no other implementation
is using it. It is like, TDX wants a higher abstraction to cover both TDX and SNP,
but SNP is not using it, which makes the abstraction looks strange.

> Here is the patch to export those info via sysfs.
>
> From e0744e506eb92e47d8317e489945a3ba804edfa7 Mon Sep 17 00:00:00 2001
> Message-Id: <e0744e506eb92e47d8317e489945a3ba804edfa7.1680221730.git.isaku.yamahata@xxxxxxxxx>
> In-Reply-To: <8e0bc0e8e5d435f54f10c7642a862629ef2acb89.1680221729.git.isaku.yamahata@xxxxxxxxx>
> References: <8e0bc0e8e5d435f54f10c7642a862629ef2acb89.1680221729.git.isaku.yamahata@xxxxxxxxx>
> From: Isaku Yamahata <isaku.yamahata@xxxxxxxxx>
> Date: Thu, 30 Mar 2023 00:05:03 -0700
> Subject: [PATCH] x86/virt/tdx: Export TD config params of TDX module via sysfs
>
> TDX module has parameters for VMM to configure TD. User space VMM, e.g.
> qemu, needs to know it. Export them to user space via sysfs.
>
> TDX 1.0 provides TDH.SYS.INFO to provide system information in
> TDSYSINFO_STRUCT. Its future extensibility is limited because of its
> struct. From TDX 1.5, TDH.SYS.RD(metadata field_id) to read the info
> specified by field id. So instead of exporting TDSYSINFO_STRUCT, adapt
> metadata way to export those system information.
>
> Signed-off-by: Isaku Yamahata <isaku.yamahata@xxxxxxxxx>
> ---
> Documentation/ABI/testing/sysfs-firmware-tdx | 23 +++
> arch/x86/include/asm/tdx.h | 33 ++++
> arch/x86/virt/vmx/tdx/tdx.c | 164 +++++++++++++++++++
> arch/x86/virt/vmx/tdx/tdx.h | 18 ++
> 4 files changed, 238 insertions(+)
> create mode 100644 Documentation/ABI/testing/sysfs-firmware-tdx
>
> diff --git a/Documentation/ABI/testing/sysfs-firmware-tdx b/Documentation/ABI/testing/sysfs-firmware-tdx
> new file mode 100644
> index 000000000000..1f26fb178144
> --- /dev/null
> +++ b/Documentation/ABI/testing/sysfs-firmware-tdx
> @@ -0,0 +1,23 @@
> +What: /sys/firmware/tdx/tdx_module/metadata
> +Date: March 2023
> +KernelVersion: 6.3
> +Contact: Isaku Yamahata <isaku.yamahata@xxxxxxxxx>, kvm@xxxxxxxxxxxxxxx
> +Users: qemu, libvirt
> +Description:
> + The TDX feature requires a firmware that is known as the TDX
> + module. The TDX module exposes its metadata in the following
> + read-only files. The information corresponds to the TDX global
> + metadata specified by 64bit field id. The file name is hex
> + string in lower case. The value is binary.
> + User space VMM like qemu needs refer to them to determine what
> + parameters are needed or allowed to configure guest TDs.
> +
> + ================== ============================================
> + 1900000300000000 ATTRIBUTES_FIXED0
> + 1900000300000001 ATTRIBUTES_FIXED1
> + 1900000300000002 XFAM_FIXED0
> + 1900000300000003 XFAM_FIXED1
> + 9900000100000004 NUM_CPUID_CONFIG
> + 9900000300000400 CPUID_LEAVES
> + 9900000300000500 CPUID_VALUES
> + ================== ============================================
> \ No newline at end of file
> diff --git a/arch/x86/include/asm/tdx.h b/arch/x86/include/asm/tdx.h
> index 05870e5ed131..c650ac22a916 100644
> --- a/arch/x86/include/asm/tdx.h
> +++ b/arch/x86/include/asm/tdx.h
> @@ -110,6 +110,39 @@ struct tdx_cpuid_config {
> u32 edx;
> } __packed;
>
> +struct tdx_cpuid_config_leaf {
> + u32 leaf;
> + u32 sub_leaf;
> +} __packed;
> +static_assert(offsetof(struct tdx_cpuid_config, leaf) ==
> + offsetof(struct tdx_cpuid_config_leaf, leaf));
> +static_assert(offsetof(struct tdx_cpuid_config, sub_leaf) ==
> + offsetof(struct tdx_cpuid_config_leaf, sub_leaf));
> +static_assert(offsetofend(struct tdx_cpuid_config, sub_leaf) ==
> + sizeof(struct tdx_cpuid_config_leaf));
> +
> +struct tdx_cpuid_config_value {
> + u32 eax;
> + u32 ebx;
> + u32 ecx;
> + u32 edx;
> +} __packed;
> +static_assert(offsetof(struct tdx_cpuid_config, eax) -
> + offsetof(struct tdx_cpuid_config, eax) ==
> + offsetof(struct tdx_cpuid_config_value, eax));
> +static_assert(offsetof(struct tdx_cpuid_config, ebx) -
> + offsetof(struct tdx_cpuid_config, eax) ==
> + offsetof(struct tdx_cpuid_config_value, ebx));
> +static_assert(offsetof(struct tdx_cpuid_config, ecx) -
> + offsetof(struct tdx_cpuid_config, eax) ==
> + offsetof(struct tdx_cpuid_config_value, ecx));
> +static_assert(offsetof(struct tdx_cpuid_config, edx) -
> + offsetof(struct tdx_cpuid_config, eax) ==
> + offsetof(struct tdx_cpuid_config_value, edx));
> +static_assert(offsetofend(struct tdx_cpuid_config, edx) -
> + offsetof(struct tdx_cpuid_config, eax) ==
> + sizeof(struct tdx_cpuid_config_value));
> +
> #define TDSYSINFO_STRUCT_SIZE 1024
> #define TDSYSINFO_STRUCT_ALIGNMENT 1024
>
> diff --git a/arch/x86/virt/vmx/tdx/tdx.c b/arch/x86/virt/vmx/tdx/tdx.c
> index f9f9c1b76501..56ca520d67d6 100644
> --- a/arch/x86/virt/vmx/tdx/tdx.c
> +++ b/arch/x86/virt/vmx/tdx/tdx.c
> @@ -33,6 +33,12 @@
> #include <asm/tdx.h>
> #include "tdx.h"
>
> +#ifdef CONFIG_SYSFS
> +static int tdx_sysfs_init(void);
> +#else
> +static inline int tdx_sysfs_init(void) { return 0;}
> +#endif
> +
> u32 tdx_global_keyid __ro_after_init;
> EXPORT_SYMBOL_GPL(tdx_global_keyid);
> static u32 tdx_guest_keyid_start __ro_after_init;
> @@ -399,6 +405,10 @@ static int __tdx_get_sysinfo(struct tdsysinfo_struct *sysinfo,
> if (ret)
> return ret;
>
> + ret = tdx_sysfs_init();
> + if (ret)
> + return ret;
> +
> pr_info("TDX module: atributes 0x%x, vendor_id 0x%x, major_version %u, minor_version %u, build_date %u, build_num %u",
> sysinfo->attributes, sysinfo->vendor_id,
> sysinfo->major_version, sysinfo->minor_version,
> @@ -1367,3 +1377,157 @@ int tdx_enable(void)
> return ret;
> }
> EXPORT_SYMBOL_GPL(tdx_enable);
> +
> +#ifdef CONFIG_SYSFS
> +
> +static struct kobject *tdx_kobj;
> +static struct kobject *tdx_module_kobj;
> +static struct kobject *tdx_metadata_kobj;
> +
> +#define TDX_METADATA_ATTR(_name, field_id_name, _size) \
> +static struct bin_attribute tdx_metadata_ ## _name = { \
> + .attr = { \
> + .name = field_id_name, \
> + .mode = 0444, \
> + }, \
> + .size = _size, \
> + .read = tdx_metadata_ ## _name ## _show, \
> +}
> +
> +#define TDX_METADATA_ATTR_SHOW(_name, field_id_name) \
> +static ssize_t tdx_metadata_ ## _name ## _show(struct file *filp, struct kobject *kobj, \
> + struct bin_attribute *bin_attr, \
> + char *buf, loff_t offset, size_t count) \
> +{ \
> + struct tdsysinfo_struct *sysinfo = &PADDED_STRUCT(tdsysinfo); \
> + \
> + return memory_read_from_buffer(buf, count, &offset, \
> + &sysinfo->_name, \
> + sizeof(sysinfo->_name)); \
> +} \
> +TDX_METADATA_ATTR(_name, field_id_name, sizeof_field(struct tdsysinfo_struct, _name))
> +
> +TDX_METADATA_ATTR_SHOW(attributes_fixed0, TDX_METADATA_ATTRIBUTES_FIXED0_NAME);
> +TDX_METADATA_ATTR_SHOW(attributes_fixed1, TDX_METADATA_ATTRIBUTES_FIXED1_NAME);
> +TDX_METADATA_ATTR_SHOW(xfam_fixed0, TDX_METADATA_XFAM_FIXED0_NAME);
> +TDX_METADATA_ATTR_SHOW(xfam_fixed1, TDX_METADATA_XFAM_FIXED1_NAME);
> +
> +static ssize_t tdx_metadata_num_cpuid_config_show(struct file *filp, struct kobject *kobj,
> + struct bin_attribute *bin_attr,
> + char *buf, loff_t offset, size_t count)
> +{
> + struct tdsysinfo_struct *sysinfo = &PADDED_STRUCT(tdsysinfo);
> + /*
> + * Although tdsysinfo_struct.num_cpuid_config is defined as u32 for
> + * alignment, TDX 1.5 defines metadata NUM_CONFIG_CPUID as u16.
> + */
> + u16 tmp = (u16)sysinfo->num_cpuid_config;
> +
> + WARN_ON_ONCE(tmp != sysinfo->num_cpuid_config);
> + return memory_read_from_buffer(buf, count, &offset, &tmp, sizeof(tmp));
> +}
> +TDX_METADATA_ATTR(num_cpuid_config, TDX_METADATA_NUM_CPUID_CONFIG_NAME, sizeof(u16));
> +
> +static ssize_t tdx_metadata_cpuid_leaves_show(struct file *filp, struct kobject *kobj,
> + struct bin_attribute *bin_attr, char *buf,
> + loff_t offset, size_t count)
> +{
> + struct tdsysinfo_struct *sysinfo = &PADDED_STRUCT(tdsysinfo);
> + ssize_t r;
> + struct tdx_cpuid_config_leaf *tmp;
> + u32 i;
> +
> + tmp = kmalloc(bin_attr->size, GFP_KERNEL);
> + if (!tmp)
> + return -ENOMEM;
> +
> + for (i = 0; i < sysinfo->num_cpuid_config; i++) {
> + struct tdx_cpuid_config *c = &sysinfo->cpuid_configs[i];
> + struct tdx_cpuid_config_leaf *leaf = (struct tdx_cpuid_config_leaf *)c;
> +
> + memcpy(tmp + i, leaf, sizeof(*leaf));
> + }
> +
> + r = memory_read_from_buffer(buf, count, &offset, tmp, bin_attr->size);
> + kfree(tmp);
> + return r;
> +}
> +
> +TDX_METADATA_ATTR(cpuid_leaves, TDX_METADATA_CPUID_LEAVES_NAME, 0);
> +
> +static ssize_t tdx_metadata_cpuid_values_show(struct file *filp, struct kobject *kobj,
> + struct bin_attribute *bin_attr, char *buf,
> + loff_t offset, size_t count)
> +{
> + struct tdsysinfo_struct *sysinfo = &PADDED_STRUCT(tdsysinfo);
> + struct tdx_cpuid_config_value *tmp;
> + ssize_t r;
> + u32 i;
> +
> + tmp = kmalloc(bin_attr->size, GFP_KERNEL);
> + if (!tmp)
> + return -ENOMEM;
> +
> + for (i = 0; i < sysinfo->num_cpuid_config; i++) {
> + struct tdx_cpuid_config *c = &sysinfo->cpuid_configs[i];
> + struct tdx_cpuid_config_value *value = (struct tdx_cpuid_config_value *)&c->eax;
> +
> + memcpy(tmp + i, value, sizeof(*value));
> + }
> +
> + r = memory_read_from_buffer(buf, count, &offset, tmp, bin_attr->size);
> + kfree(tmp);
> + return r;
> +}
> +
> +TDX_METADATA_ATTR(cpuid_values, TDX_METADATA_CPUID_VALUES_NAME, 0);
> +
> +static struct bin_attribute *tdx_metadata_attrs[] = {
> + &tdx_metadata_attributes_fixed0,
> + &tdx_metadata_attributes_fixed1,
> + &tdx_metadata_xfam_fixed0,
> + &tdx_metadata_xfam_fixed1,
> + &tdx_metadata_num_cpuid_config,
> + &tdx_metadata_cpuid_leaves,
> + &tdx_metadata_cpuid_values,
> + NULL,
> +};
> +
> +static const struct attribute_group tdx_metadata_attr_group = {
> + .bin_attrs = tdx_metadata_attrs,
> +};
> +
> +static int tdx_sysfs_init(void)
> +{
> + struct tdsysinfo_struct *sysinfo;
> + int ret;
> +
> + tdx_kobj = kobject_create_and_add("tdx", firmware_kobj);
> + if (!tdx_kobj) {
> + pr_err("kobject_create_and_add tdx failed\n");
> + return -EINVAL;
> + }
> +
> + tdx_module_kobj = kobject_create_and_add("tdx_module", tdx_kobj);
> + if (!tdx_module_kobj) {
> + pr_err("kobject_create_and_add tdx_module failed\n");
> + return -EINVAL;
> + }
> + tdx_metadata_kobj = kobject_create_and_add("metadata", tdx_module_kobj);
> + if (!tdx_metadata_kobj) {
> + pr_err("Sysfs exporting tdx global metadata failed %d\n", ret);
> + return -EINVAL;
> + }
> +
> + sysinfo = &PADDED_STRUCT(tdsysinfo);
> + tdx_metadata_cpuid_leaves.size = sysinfo->num_cpuid_config *
> + sizeof(struct tdx_cpuid_config_leaf);
> + tdx_metadata_cpuid_values.size = sysinfo->num_cpuid_config *
> + sizeof(struct tdx_cpuid_config_value);
> + ret = sysfs_create_group(tdx_metadata_kobj, &tdx_metadata_attr_group);
> + if (ret)
> + pr_err("Sysfs exporting tdx module attributes failed %d\n", ret);
> +
> + return ret;
> +}
> +#endif
> diff --git a/arch/x86/virt/vmx/tdx/tdx.h b/arch/x86/virt/vmx/tdx/tdx.h
> index db0cbcceb5b3..a48f38fe6cc4 100644
> --- a/arch/x86/virt/vmx/tdx/tdx.h
> +++ b/arch/x86/virt/vmx/tdx/tdx.h
> @@ -93,4 +93,22 @@ struct tdmr_info_list {
> int max_tdmrs; /* How many 'tdmr_info's are allocated */
> };
>
> +/* TDX metadata base field id. */
> +#define TDX_METADATA_ATTRIBUTES_FIXED0 0x1900000300000000ULL
> +#define TDX_METADATA_ATTRIBUTES_FIXED1 0x1900000300000001ULL
> +#define TDX_METADATA_XFAM_FIXED0 0x1900000300000002ULL
> +#define TDX_METADATA_XFAM_FIXED1 0x1900000300000003ULL
> +#define TDX_METADATA_NUM_CPUID_CONFIG 0x9900000100000004ULL
> +#define TDX_METADATA_CPUID_LEAVES 0x9900000300000400ULL
> +#define TDX_METADATA_CPUID_VALUES 0x9900000300000500ULL
> +
> +/* File name for sysfs: hex with lower case. */
> +#define TDX_METADATA_ATTRIBUTES_FIXED0_NAME "1900000300000000"
> +#define TDX_METADATA_ATTRIBUTES_FIXED1_NAME "1900000300000001"
> +#define TDX_METADATA_XFAM_FIXED0_NAME "1900000300000002"
> +#define TDX_METADATA_XFAM_FIXED1_NAME "1900000300000003"
> +#define TDX_METADATA_NUM_CPUID_CONFIG_NAME "9900000100000004"
> +#define TDX_METADATA_CPUID_LEAVES_NAME "9900000300000400"
> +#define TDX_METADATA_CPUID_VALUES_NAME "9900000300000500"
> +
> #endif