Re: [PATCH v19 023/130] KVM: TDX: Initialize the TDX module when loading the KVM intel kernel module

From: Sean Christopherson
Date: Thu Apr 11 2024 - 10:05:49 EST


On Thu, Apr 11, 2024, Kai Huang wrote:
> On 11/04/2024 3:29 am, Sean Christopherson wrote:
> > On Wed, Apr 10, 2024, Kai Huang wrote:
> > > > > What happens if any CPU goes online *BETWEEN* tdx_hardware_setup() and
> > > > > kvm_init()?
> > > > >
> > > > > Looks we have two options:
> > > > >
> > > > > 1) move registering CPU hotplug callback before tdx_hardware_setup(), or
> > > > > 2) we need to disable CPU hotplug until callbacks have been registered.
> >
> > This is all so dumb (not TDX, the current state of KVM). All of the hardware
> > enabling crud is pointless complex inherited from misguided, decade old paranoia
> > that led to the decision to enable VMX if and only if VMs are running. Enabling
> > VMX doesn't make the system less secure, and the insane dances we are doing to
> > do VMXON on-demand makes everything *more* fragile.
> >
> > And all of this complexity really was driven by VMX, enabling virtualization for
> > every other vendor, including AMD/SVM, is completely uninteresting. Forcing other
> > architectures/vendors to take on yet more complexity doesn't make any sense.
>
> Ah, I actually preferred this solution, but I was trying to follow your
> suggestion here:
>
> https://lore.kernel.org/lkml/ZW6FRBnOwYV-UCkY@xxxxxxxxxx/
>
> form which I interpreted you didn't like always having VMX enabled when KVM
> is present. :-)

I had a feeling I said something along those lines in the past.

> > Barely tested, and other architectures would need to be converted, but I don't
> > see any obvious reasons why we can't simply enable virtualization when the module
> > is loaded.
> >
> > The diffstat pretty much says it all.
>
> Thanks a lot for the code!
>
> I can certainly follow up with this and generate a reviewable patchset if I
> can confirm with you that this is what you want?

Yes, I think it's the right direction. I still have minor concerns about VMX
being enabled while kvm.ko is loaded, which means that VMXON will _always_ be
enabled if KVM is built-in. But after seeing the complexity that is needed to
safely initialize TDX, and after seeing just how much complexity KVM already
has because it enables VMX on-demand (I hadn't actually tried removing that code
before), I think the cost of that complexity far outweighs the risk of "always"
being post-VMXON.

Within reason, I recommend getting feedback from others before you spend _too_
much time on this. It's entirely possible I'm missing/forgetting some other angle.