Re: [PATCH v2 02/18] x86/reboot: Expose VMCS crash hooks if and only if KVM_INTEL is enabled

From: Huang, Kai
Date: Sun Mar 12 2023 - 20:31:59 EST


Hi Sean,

Thanks for copying me.

On Fri, 2023-03-10 at 13:42 -0800, Sean Christopherson wrote:
> Expose the crash/reboot hooks used by KVM to do VMCLEAR+VMXOFF if and
> only if there's a potential in-tree user, KVM_INTEL.
>
> Signed-off-by: Sean Christopherson <seanjc@xxxxxxxxxx>
> ---
> arch/x86/include/asm/reboot.h | 2 ++
> arch/x86/kernel/reboot.c | 4 ++++
> 2 files changed, 6 insertions(+)
>
> diff --git a/arch/x86/include/asm/reboot.h b/arch/x86/include/asm/reboot.h
> index 2551baec927d..33c8e911e0de 100644
> --- a/arch/x86/include/asm/reboot.h
> +++ b/arch/x86/include/asm/reboot.h
> @@ -25,8 +25,10 @@ void __noreturn machine_real_restart(unsigned int type);
> #define MRR_BIOS 0
> #define MRR_APM 1
>
> +#if IS_ENABLED(CONFIG_KVM_INTEL)
> typedef void crash_vmclear_fn(void);
> extern crash_vmclear_fn __rcu *crash_vmclear_loaded_vmcss;
> +#endif
> void cpu_emergency_disable_virtualization(void);
>
> typedef void (*nmi_shootdown_cb)(int, struct pt_regs*);
> diff --git a/arch/x86/kernel/reboot.c b/arch/x86/kernel/reboot.c
> index 299b970e5f82..6c0b1634b884 100644
> --- a/arch/x86/kernel/reboot.c
> +++ b/arch/x86/kernel/reboot.c
> @@ -787,6 +787,7 @@ void machine_crash_shutdown(struct pt_regs *regs)
> }
> #endif
>
> +#if IS_ENABLED(CONFIG_KVM_INTEL)
> /*
> * This is used to VMCLEAR all VMCSs loaded on the
> * processor. And when loading kvm_intel module, the
> @@ -807,6 +808,7 @@ static inline void cpu_crash_vmclear_loaded_vmcss(void)
> do_vmclear_operation();
> rcu_read_unlock();
> }
> +#endif
>
> /* This is the CPU performing the emergency shutdown work. */
> int crashing_cpu = -1;
> @@ -818,7 +820,9 @@ int crashing_cpu = -1;
> */
> void cpu_emergency_disable_virtualization(void)
> {
> +#if IS_ENABLED(CONFIG_KVM_INTEL)
> cpu_crash_vmclear_loaded_vmcss();
> +#endif
>
> cpu_emergency_vmxoff();

The changelog says to expose the *hooks* (plural) used to do "VMCLEAR+VMXOFF"
only when KVM_INTEL is enabled, but here only the VMCLEAR part is guarded by
CONFIG_KVM_INTEL. So either the changelog needs to be updated, or the code
should be adjusted?

Personally, I think it's better to put the VMXOFF part under CONFIG_KVM_INTEL
too, if you want to do this.

But I am not sure whether we want to do this at all (i.e. wrapping the relevant
code in CONFIG_KVM_INTEL). In later patches you mention the case of an
out-of-tree hypervisor, for instance in the changelog of patch 04:

There's no need to attempt VMXOFF if KVM (or some other out-of-tree 
hypervisor) isn't loaded/active...

This means we want to handle VMCLEAR+VMXOFF for an out-of-tree hypervisor too.
So shouldn't the hooks always exist, rather than being available only when
KVM_INTEL or KVM_AMD is enabled, so that an out-of-tree hypervisor can register
its callbacks?
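
To illustrate what I mean by "always exist", here is a rough userspace sketch,
not kernel code and not tested against this series. All names below are made up
for illustration, and the plain pointer stands in for what would need RCU (or
similar) protection in the kernel. The point is only that the registration
interface is compiled unconditionally, and the emergency path does nothing when
no hypervisor has registered:

```c
#include <stddef.h>

/* Hypothetical userspace model; names are illustrative, not kernel APIs. */
typedef void emergency_virt_fn(void);

/* NULL when no hypervisor (in-tree or out-of-tree) has registered. */
static emergency_virt_fn *emergency_virt_callback;

/* Returns 0 on success, -1 if fn is NULL or a callback is already set. */
int register_emergency_virt_callback(emergency_virt_fn *fn)
{
	if (fn == NULL || emergency_virt_callback != NULL)
		return -1;
	emergency_virt_callback = fn;
	return 0;
}

/* Clears the callback only if fn is the one currently registered. */
void unregister_emergency_virt_callback(emergency_virt_fn *fn)
{
	if (emergency_virt_callback == fn)
		emergency_virt_callback = NULL;
}

/* Returns 1 if a registered callback ran, 0 if there was nothing to do. */
int emergency_disable_virtualization(void)
{
	emergency_virt_fn *fn = emergency_virt_callback;

	if (fn == NULL)
		return 0;
	fn();
	return 1;
}

/* Stand-in for a hypervisor's VMCLEAR+VMXOFF handler. */
static int demo_calls;
static void demo_hypervisor_disable(void)
{
	demo_calls++;
}
```

With something along these lines, no #ifdef is needed at the call site, at the
cost of keeping a small amount of code around even when neither KVM_INTEL nor
KVM_AMD is enabled.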


> cpu_emergency_svm_disable();
> --
> 2.40.0.rc1.284.g88254d51c5-goog
>