Re: [PATCH v19 023/130] KVM: TDX: Initialize the TDX module when loading the KVM intel kernel module

From: Huang, Kai
Date: Thu Apr 25 2024 - 18:07:48 EST


On Thu, 2024-04-25 at 09:35 -0700, Sean Christopherson wrote:
> On Tue, Apr 23, 2024, Kai Huang wrote:
> > On Tue, 2024-04-23 at 08:15 -0700, Sean Christopherson wrote:
> > > Presumably that approach relies on something blocking onlining CPUs when TDX is
> > > active. And if that's not the case, the proposed patches are buggy.
> >
> > The current patch ([PATCH 023/130] KVM: TDX: Initialize the TDX module
> > when loading the KVM intel kernel module) indeed is buggy, but I don't
> > quite follow why we need to block onlining CPU when TDX is active?
>
> I was saying that based on my reading of the code, either (a) the code is buggy
> or (b) something blocks onlining CPUs when TDX is active. Sounds like the answer
> is (a).

Yeah it's a).

>
> > There's no hard things that prevent us to do so. KVM just need to do
> > VMXON + tdx_cpu_enable() inside kvm_online_cpu().
> >
> > >
> > > > Btw, the ideal (or probably the final) plan is to handle tdx_cpu_enable()
> > > > in TDX's own CPU hotplug callback in the core-kernel and hide it from all
> > > > other in-kernel TDX users.  
> > > >
> > > > Specifically:
> > > >
> > > > 1) that callback, e.g., tdx_online_cpu() will be placed _before_ any in-
> > > > kernel TDX users like KVM's callback.
> > > > 2) In tdx_online_cpu(), we do VMXON + tdx_cpu_enable() + VMXOFF, and
> > > > return error in case of any error to prevent that cpu from going online.
> > > >
> > > > That makes sure that, if TDX is supported by the platform, we basically
> > > > guarantees all online CPUs are ready to issue SEAMCALL (of course, the in-
> > > > kernel TDX user still needs to do VMXON for it, but that's TDX user's
> > > > responsibility).
> > > >
> > > > But that obviously needs to move VMXON to the core-kernel.
> > >
> > > It doesn't strictly have to be core kernel per se, just in code that sits below
> > > KVM, e.g. in a seperate module called VAC[*] ;-)
> > >
> > > [*] https://lore.kernel.org/all/ZW6FRBnOwYV-UCkY@xxxxxxxxxx
> >
> > Could you elaborate why vac.ko is necessary?
> >
> > Being a module natually we will need to handle module init and exit. But
> > TDX cannot be disabled and re-enabled after initialization, so in general
> > the vac.ko doesn't quite fit for TDX.
> >
> > And I am not sure what's the fundamental difference between managing TDX
> > module in a module vs in the core-kernel from KVM's perspective.
>
> VAC isn't strictly necessary. What I was saying is that it's not strictly
> necessary for the core kernel to handle VMXON either. I.e. it could be done in
> something like VAC, or it could be done in the core kernel.

Right, but so far I cannot see any advantage of using a VAC module,
perhaps I am missing something although.

>
> The important thing is that they're handled by _one_ entity. What we have today
> is probably the worst setup; VMXON is handled by KVM, but TDX.SYS.LP.INIT is
> handled by core kernel (sort of).

I cannot argue against this :-)

But from this point of view, I cannot see difference between tdx_enable()
and tdx_cpu_enable(), because they both in core-kernel while depend on KVM
to handle VMXON.

Or, do you prefer to we move VMXON to the core-kernel at this stage, i.e.,
as a prepare work for KVM TDX?