Re: [PATCH v2 02/18] x86/reboot: Expose VMCS crash hooks if and only if KVM_INTEL is enabled

From: Sean Christopherson
Date: Mon Mar 13 2023 - 14:33:06 EST


On Mon, Mar 13, 2023, Huang, Kai wrote:
> Hi Sean,
>
> Thanks for copying me.
>
> On Fri, 2023-03-10 at 13:42 -0800, Sean Christopherson wrote:
> > Expose the crash/reboot hooks used by KVM to do VMCLEAR+VMXOFF if and
> > only if there's a potential in-tree user, KVM_INTEL.
> >
> > Signed-off-by: Sean Christopherson <seanjc@xxxxxxxxxx>
> > ---

...

> > diff --git a/arch/x86/kernel/reboot.c b/arch/x86/kernel/reboot.c
> > index 299b970e5f82..6c0b1634b884 100644
> > --- a/arch/x86/kernel/reboot.c
> > +++ b/arch/x86/kernel/reboot.c
> > @@ -787,6 +787,7 @@ void machine_crash_shutdown(struct pt_regs *regs)
> > }
> > #endif
> >
> > +#if IS_ENABLED(CONFIG_KVM_INTEL)
> > /*
> > * This is used to VMCLEAR all VMCSs loaded on the
> > * processor. And when loading kvm_intel module, the
> > @@ -807,6 +808,7 @@ static inline void cpu_crash_vmclear_loaded_vmcss(void)
> > do_vmclear_operation();
> > rcu_read_unlock();
> > }
> > +#endif
> >
> > /* This is the CPU performing the emergency shutdown work. */
> > int crashing_cpu = -1;
> > @@ -818,7 +820,9 @@ int crashing_cpu = -1;
> > */
> > void cpu_emergency_disable_virtualization(void)
> > {
> > +#if IS_ENABLED(CONFIG_KVM_INTEL)
> > cpu_crash_vmclear_loaded_vmcss();
> > +#endif
> >
> > cpu_emergency_vmxoff();
>
> In the changelog you mentioned exposing the *hooks* (plural) used to do
> "VMCLEAR+VMXOFF" only when KVM_INTEL is on, but here only "VMCLEAR" is wrapped
> in CONFIG_KVM_INTEL. So either the changelog needs improvement, or the code
> should be adjusted?

I'll reword the changelog; "hooks" in my head was referring to the register and
unregister "hooks", not the callback itself.
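
To illustrate the distinction between the register/unregister "hooks" and the
callback itself, here is a minimal userspace sketch. It is not the kernel code:
every demo_* name and the DEMO_KVM_INTEL macro (standing in for
IS_ENABLED(CONFIG_KVM_INTEL)) are hypothetical, and the RCU protection that the
real callback pointer uses is left out for brevity.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for IS_ENABLED(CONFIG_KVM_INTEL). */
#ifndef DEMO_KVM_INTEL
#define DEMO_KVM_INTEL 1
#endif

typedef void (demo_crash_vmclear_fn)(void);

/* Counters so the behavior can be observed without real VMX hardware. */
static int demo_vmclear_calls;
static int demo_vmxoff_calls;

#if DEMO_KVM_INTEL
/* The callback: what actually runs on the crash/reboot path. */
static demo_crash_vmclear_fn *demo_crash_vmclear_cb;

/* The "hooks": how a hypervisor module installs/removes the callback. */
void demo_register_crash_vmclear(demo_crash_vmclear_fn *fn)
{
	assert(fn != NULL);
	demo_crash_vmclear_cb = fn;
}

void demo_unregister_crash_vmclear(void)
{
	demo_crash_vmclear_cb = NULL;
}

static void demo_crash_vmclear_loaded_vmcss(void)
{
	demo_crash_vmclear_fn *fn = demo_crash_vmclear_cb;

	if (fn)
		fn();
}
#endif

/* Example callback a module might register (hypothetical). */
static void demo_vmclear_all(void)
{
	demo_vmclear_calls++;
}

/* VMXOFF stays unconditional at this point in the series. */
static void demo_emergency_vmxoff(void)
{
	demo_vmxoff_calls++;
}

void demo_emergency_disable_virtualization(void)
{
#if DEMO_KVM_INTEL
	demo_crash_vmclear_loaded_vmcss();
#endif
	demo_emergency_vmxoff();
}
```

Building with -DDEMO_KVM_INTEL=0 compiles out the registration functions and
the VMCLEAR call while leaving the VMXOFF path in place, which mirrors what
this patch does.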

> Personally, I think it's better to move VMXOFF part within CONFIG_KVM_INTEL too,
> if you want to do this.

That happens eventually in the final third of this series.

> But I am not sure whether we want to do this (having CONFIG_KVM_INTEL around the
> relevant code). In later patches, you mentioned the case of out-of-tree
> hypervisor, for instance, below in the changelog of patch 04:
>
> There's no need to attempt VMXOFF if KVM (or some other out-of-tree
> hypervisor) isn't loaded/active...
>
> This means we want to handle VMCLEAR+VMXOFF in case of out-of-tree hypervisor
> too. So, shouldn't the hooks always exist, rather than being available only when
> KVM_INTEL or KVM_AMD is on, so the out-of-tree hypervisor can register their callbacks?

Ah, I see how I confused things with that statement. My intent was only to call
out that, technically, a non-NULL callback doesn't mean KVM is loaded. I didn't
intend to sign the kernel up for going out of its way to support out-of-tree hypervisors.

Does it read better if I add a "that piggybacked the callback" qualifier?

There's no need to attempt VMXOFF if KVM (or some other out-of-tree hypervisor
that piggybacked the callback) isn't loaded/active, i.e. if the CPU can't
possibly be post-VMXON.
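
As a rough illustration of that point, the sketch below shows why a non-NULL
callback only says that *someone* registered, not that KVM did. It is a
userspace approximation under stated assumptions: all demo_* names are
hypothetical, there is no RCU or real VMXOFF, and it is not meant to reflect
how the later patches actually implement the check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef void (demo_virt_cb_fn)(void);

/* Single callback slot; the core code cannot tell who owns it. */
static demo_virt_cb_fn *demo_virt_cb;
static int demo_disable_attempts;

/* Two hypothetical registrants: in-tree KVM and an out-of-tree hypervisor. */
static void demo_kvm_disable(void)
{
	demo_disable_attempts++;
}

static void demo_oot_disable(void)
{
	demo_disable_attempts++;
}

void demo_register_virt_cb(demo_virt_cb_fn *fn)
{
	assert(fn != NULL);
	demo_virt_cb = fn;
}

void demo_unregister_virt_cb(void)
{
	demo_virt_cb = NULL;
}

/*
 * Returns true if a disable was attempted.  With no callback registered,
 * nothing could have put the CPU post-VMXON, so the work is skipped.
 */
bool demo_emergency_disable(void)
{
	demo_virt_cb_fn *cb = demo_virt_cb;

	if (!cb)
		return false;

	cb();
	return true;
}
```

From the caller's point of view, registering demo_kvm_disable or
demo_oot_disable is indistinguishable; only the NULL case lets the emergency
path skip the work.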