Re: [PATCH v12 07/22] x86/virt/tdx: Add skeleton to enable TDX on demand
From: Peter Zijlstra
Date: Mon Jul 03 2023 - 06:50:05 EST
On Fri, Jun 30, 2023 at 02:24:56PM -0700, Sean Christopherson wrote:
> I dunno about that, *totally* killing TDX would make my life a lot simpler ;-)
:-)
> > > I don't get this obsession with doing it at module load time :/
>
> Waiting until userspace attempts to create the first TDX guest adds complexity
> and limits what KVM can do to harden itself. Currently, all feature support in
> KVM is effectively frozen at module load. E.g. most of the setup code is
> contained in __init functions, many module-scoped variables are effectively
> RO after init (though they can't be marked as such until we smush kvm-intel.ko
> and kvm-amd.ko into kvm.ko, which is tentatively the long-term plan). All of
> those patterns would get tossed aside if KVM waits until userspace attempts to
> create the first guest.
Pff, all that is perfectly possible, just a wee bit more work :-) I
mean, we manage to poke text that's RO, surely we can poke a variable
that's supposedly RO.
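
A rough userspace analogy for the point above, not kernel code and not
anything from this patch series: a variable can be sealed read-only after
init and still be updated later by briefly lifting the protection. In the
kernel the relevant pieces would be the __ro_after_init annotation and, for
text, text_poke(); the sketch below uses mmap()/mprotect() instead, and the
ro_flag_* names are made up for illustration.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical feature flag that lives alone on its own page. */
static int *feature_enabled;
static size_t page_size;

/* "Init": map a page, store the initial value, then seal it read-only. */
int ro_flag_init(int initial)
{
	void *p;

	page_size = (size_t)sysconf(_SC_PAGESIZE);
	p = mmap(NULL, page_size, PROT_READ | PROT_WRITE,
		 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED)
		return -1;

	feature_enabled = p;
	*feature_enabled = initial;
	return mprotect(feature_enabled, page_size, PROT_READ);
}

/* Late "poke": open the page for writing, update the flag, reseal. */
int ro_flag_poke(int value)
{
	if (mprotect(feature_enabled, page_size, PROT_READ | PROT_WRITE))
		return -1;
	*feature_enabled = value;
	return mprotect(feature_enabled, page_size, PROT_READ);
}

/* Readers never need write access. */
int ro_flag_read(void)
{
	return *feature_enabled;
}
```

Calling ro_flag_init(0) and later ro_flag_poke(1) leaves ro_flag_read()
returning 1, while a stray write outside the poke window would fault.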
And I really wish we could put part of the kvm-intel/amd.ko things in
the kernel proper and reduce the EXPORT_SYMBOL surface -- we're
exporting a whole bunch of things that really shouldn't be, just for KVM
:/
> The userspace experience would also be poor, as KVM can't know whether or not TDX is
> actually supported until the TDX module is fully loaded and configured.
Quality that :-(
> There are also latency and noisy neighbor concerns, e.g. we *really* don't want
> to end up in a situation where creating a TDX guest for a customer can observe
> arbitrary latency *and* potentially be disruptive to VMs already running on the
> host.
Well, that's a quality of implementation issue with the whole TDX
crapola. Sounds like we want to impose latency constraints on the
various TDX calls. Allowing it to consume arbitrary amounts of CPU time
is unacceptable in any case.
> Userspace can work around the second and third issues by spawning a dummy TDX guest
> as early as possible, but that adds complexity to userspace, especially if there's
> any desire for it to be race free, e.g. with respect to reporting system capabilities
> to the control plane.
FWIW, I'm 100% behind pushing complexity into userspace if it makes for
a simpler kernel.
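
To give a feel for what that userspace complexity might look like, here is
a minimal sketch of a race-free, run-once capability probe. Everything in
it is an assumption for illustration: expensive_probe() merely stands in
for the slow step (e.g. spawning a dummy guest), and none of the names
correspond to a real KVM or TDX interface.

```c
#include <pthread.h>
#include <stdbool.h>

static pthread_once_t probe_once = PTHREAD_ONCE_INIT;
static bool feature_usable;
static int probe_calls;

/* Hypothetical stand-in for the slow step; always succeeds here. */
static bool expensive_probe(void)
{
	probe_calls++;
	return true;
}

static void run_probe(void)
{
	feature_usable = expensive_probe();
}

/*
 * Safe to call from any thread: the first caller pays the latency,
 * concurrent callers block until the probe finishes, and later
 * callers only read the cached result.
 */
bool feature_is_usable(void)
{
	pthread_once(&probe_once, run_probe);
	return feature_usable;
}

/* Exposed only so the run-once behaviour can be checked. */
int feature_probe_count(void)
{
	return probe_calls;
}
```

Calling feature_is_usable() early, before reporting capabilities to the
control plane, moves the latency out of the first real guest creation.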
> On the flip side, limited hardware availability (unless Intel has changed its
> tune) and the amount of enabling that's required in BIOS and whatnot makes it
> highly unlikely that random Linux users are going to unknowingly boot with TDX
> enabled.
>
> That said, if this is a sticking point, let's just make enable_tdx off by default,
OK.