On Wed, Nov 06, 2024, Kai Huang wrote:
On Wed, 2024-11-06 at 07:01 -0800, Sean Christopherson wrote:
On Wed, Nov 06, 2024, Kai Huang wrote:
For this we only disable TDX but continue to load module. The reason is I think
this is similar to enable a specific KVM feature but the hardware doesn't
support it. We can go further to check the return value of tdx_cpu_enable() to
distinguish cases like "module not loaded" and "unexpected error", but I really
don't want to go that far.
Hrm, tdx_cpu_enable() is a bit of a mess. Ideally, there would be a separate
"probe" API so that KVM could detect if TDX is supported. Though maybe it's the
TDX module itself is flawed, e.g. if TDH_SYS_INIT is literally the only way to
detect whether or not a module is loaded.
We can also use P-SEAMLDR SEAMCALL to query, but I see no difference between
using TDH_SYS_INIT. If you are asking whether there's CPUID or MSR to query
then no.
Doesn't have to be a CPUID or MSR, anything idempotent would work. Which begs
the question, is that P-SEAMLDR SEAMCALL query you have in mind idempotent? :-)
So, absent a way to clean up tdx_cpu_enable(), maybe disable the module param if
it returns -ENODEV, otherwise fail the module load?
We can, but we need to assume cpuhp_setup_state_cpuslocked() itself will not
return -ENODEV (it is true now), otherwise we won't be able to distinguish
whether the -ENODEV was from cpuhp_setup_state_cpuslocked() or tdx_cpu_enable().
Unless we choose to do tdx_cpu_enable() via on_each_cpu() separately.
Btw tdx_cpu_enable() itself will print "module not loaded" in case of -ENODEV,
so the user will be aware anyway if we only disable TDX but not fail module
loading.
That only helps if a human looks at dmesg before attempting to run a TDX VM on
the host, and parsing dmesg to treat that particular scenario as fatal isn't
something I want to recommend to end users. E.g. if our platform configuration
screwed up and failed to load a TDX module, then I want that to be surfaced as
an alarm of sorts, not a silent "this platform doesn't support TDX" flag.
My concern is still the whole "different handling of error cases" seems over-
engineering.
IMO, that's a symptom of the TDX enabling code not cleanly separating "probe"
from "enable", and at a glance, that seems very solvable.
And I suspect that
cleaning things up will allow for additional hardening. E.g. I assume the lack
of MOVDIR64B should be a WARN, but because KVM checks for MOVDIR64B before
checking for basic TDX support, it's an non-commitalpr_warn().
4) tdx_enable() fails.
Ditto to 3).
No, this should fail the module load. E.g. most of the error conditions are
-ENOMEM, which has nothing to do with host support for TDX.
5) tdx_get_sysinfo() fails.
This is a kernel bug since tdx_get_sysinfo() should always return valid TDX
sysinfo structure pointer after tdx_enable() is done successfully. Currently we
just WARN() if the returned pointer is NULL and disable TDX only. I think it's
also fine.
6) TDX global metadata check fails, e.g., MAX_VCPUS etc.
Ditto to 3). For this we disable TDX only.
Where is this code?
Please check:
https://github.com/intel/tdx/blob/tdx_kvm_dev-2024-10-25.1-host-metadata-v6-rebase/arch/x86/kvm/vmx/tdx.c
.. starting at line 3320.
Before I forget, that code has a bug. This
/* Check TDX module and KVM capabilities */
if (!tdx_get_supported_attrs(&tdx_sysinfo->td_conf) ||
!tdx_get_supported_xfam(&tdx_sysinfo->td_conf))
goto get_sysinfo_err;
will return '0' on error, instead of -EINVAL (or whatever it intends).
Back to the main discussion, these checks are obvious "probe" failures and so
should disable TDX without failing module load:
if (!tdp_mmu_enabled || !enable_mmio_caching)
return -EOPNOTSUPP;
if (!cpu_feature_enabled(X86_FEATURE_MOVDIR64B)) {
pr_warn("MOVDIR64B is reqiured for TDX\n");
return -EOPNOTSUPP;
}
A kvm_find_user_return_msr() error is obviously a KVM bug, i.e. should definitely
WARN and fail module module. Ditto for kvm_enable_virtualization().
The boot_cpu_has(X86_FEATURE_TDX_HOST_PLATFORM) that's buried in tdx_enable()
really belongs in KVM. Having it in both is totally fine, but KVM shouldn't do
a bunch of work and _then_ check if all that work was pointless.
I am ok treating everything at or after tdx_get_sysinfo() as fatal to module load,
especially since, IIUC, TD_SYS_INIT can't be undone, i.e. KVM has crossed a point
of no return.
In short, assuming KVM can query if a TDX module is a loaded, I don't think it's
all that much work to do:
static bool kvm_is_tdx_supported(void)
{
if (boot_cpu_has(X86_FEATURE_TDX_HOST_PLATFORM))
return false;
if (!<is TDX module loaded>)
return false;
if (!tdp_mmu_enabled || !enable_mmio_caching)
return false;
if (WARN_ON_ONCE(!cpu_feature_enabled(X86_FEATURE_MOVDIR64B)))
return false;
return true;
}
int __init tdx_bringup(void)
{
enable_tdx = enable_tdx && kvm_is_tdx_supported();
if (!enable_tdx)
return 0;
return __tdx_bringup();
}