On Thu, Apr 11, 2024, Kai Huang wrote:
On 11/04/2024 3:29 am, Sean Christopherson wrote:
On Wed, Apr 10, 2024, Kai Huang wrote:
What happens if any CPU goes online *BETWEEN* tdx_hardware_setup() and
kvm_init()?
Looks we have two options:
1) move registering CPU hotplug callback before tdx_hardware_setup(), or
2) we need to disable CPU hotplug until callbacks have been registered.
This is all so dumb (not TDX, the current state of KVM). All of the hardware
enabling crud is pointless complex inherited from misguided, decade old paranoia
that led to the decision to enable VMX if and only if VMs are running. Enabling
VMX doesn't make the system less secure, and the insane dances we are doing to
do VMXON on-demand makes everything *more* fragile.
And all of this complexity really was driven by VMX, enabling virtualization for
every other vendor, including AMD/SVM, is completely uninteresting. Forcing other
architectures/vendors to take on yet more complexity doesn't make any sense.
Ah, I actually preferred this solution, but I was trying to follow your
suggestion here:
https://lore.kernel.org/lkml/ZW6FRBnOwYV-UCkY@xxxxxxxxxx/
form which I interpreted you didn't like always having VMX enabled when KVM
is present. :-)
I had a feeling I said something along those lines in the past.
Barely tested, and other architectures would need to be converted, but I don't
see any obvious reasons why we can't simply enable virtualization when the module
is loaded.
The diffstat pretty much says it all.
Thanks a lot for the code!
I can certainly follow up with this and generate a reviewable patchset if I
can confirm with you that this is what you want?
Yes, I think it's the right direction. I still have minor concerns about VMX
being enabled while kvm.ko is loaded, which means that VMXON will _always_ be
enabled if KVM is built-in. But after seeing the complexity that is needed to
safely initialize TDX, and after seeing just how much complexity KVM already
has because it enables VMX on-demand (I hadn't actually tried removing that code
before), I think the cost of that complexity far outweighs the risk of "always"
being post-VMXON.
Within reason, I recommend getting feedback from others before you spend _too_
much time on this. It's entirely possible I'm missing/forgetting some other angle.