Re: KVM default behavior change on module loading in kernel 6.12
From: Sean Christopherson
Date: Mon Oct 07 2024 - 14:07:27 EST
+lists and Paolo
On Mon, Oct 07, 2024, Vadim Galitsin wrote:
> Hi Sean,
>
> My name is Vadim. I am from Oracle's VirtualBox team.
>
> I noticed your commit b4886fab6fb6 (KVM: Add a module param to allow enabling
> virtualization when KVM is loaded), which is part of the 6.12-rc1 (and newer)
> kernels.
>
> The issue I am observing on the VBox side is that no VBox VMs can be started
> by default anymore. Historically, Qemu and VBox VMs could not run in parallel
> (each of them needs to enable virtualization on its own). Previously, when
> virtualization was enabled only at the point when a Qemu VM started, there was
> no such issue. I suspect the VMware guys might have exactly the same problem now.
>
> The commit makes perfect sense for server virtualization and, of course, the
> feature can be disabled by specifying "kvm.enable_virt_at_load=0" on the kernel
> command line (or by unloading the kvmXXX module(s) manually), but I think it is
> rather inconvenient for desktop virtualization users who run VMs other than
> Qemu ones.
>
> Would you consider changing the default behavior to
> "kvm.enable_virt_at_load=0", so that people who really need it can explicitly
> enable it on the kernel command line?
I'm not dead set against it, but my preference would be to force out-of-tree
hypervisor modules to adjust. Leaving enable_virt_at_load off by default risks
performance regressions due to the CPU hotplug framework serially operating on
CPUs[1]. And, no offence to VirtualBox or VMware, I care much more about not
regressing KVM users than I care about inconveniencing out-of-tree hypervisors.
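
For anyone who wants to see which setting a given machine is running with, the following is a minimal userspace sketch in Python. It assumes the parameter is exported read-only under /sys/module/kvm/parameters/enable_virt_at_load and that it reads back as "Y" or "N", as boolean module parameters normally do; the path and the helper name read_enable_virt_at_load are assumptions made for illustration, not something stated in this thread.

```python
from pathlib import Path
from typing import Optional

# Assumed sysfs location of the parameter; it only exists while kvm.ko is
# loaded and only if the parameter is exported with read permissions.
PARAM_PATH = Path("/sys/module/kvm/parameters/enable_virt_at_load")


def read_enable_virt_at_load(path: Path = PARAM_PATH) -> Optional[bool]:
    """Return True/False for the parameter, or None if it cannot be determined."""
    try:
        raw = path.read_text().strip()
    except OSError:
        # KVM not loaded, kernel older than 6.12, or parameter not exported.
        return None
    if raw in ("Y", "y", "1"):
        return True
    if raw in ("N", "n", "0"):
        return False
    return None


if __name__ == "__main__":
    state = read_enable_virt_at_load()
    messages = {
        True: "KVM enables virtualization at module load",
        False: "KVM enables virtualization when the first VM is created",
        None: "could not determine enable_virt_at_load",
    }
    print(messages[state])
```

Returning None rather than guessing keeps "KVM is not loaded" distinct from "the parameter is off", which matters to a tool deciding whether another hypervisor can claim the hardware.
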
Long term, the right answer to this problem is to move virtualization enabling
to a separate module (*very* roughly sketched out here[2]), which would allow
out-of-tree hypervisor modules to co-exist with KVM. They would obviously need
to give up control of CR4.VMXE/VMXON/EFER.SVME, but I don't think that's an
unreasonable ask.
The multi-KVM idea aside, TDX support for trusted devices is coming down the pipe
and will need to enable VMX without KVM being involved in order to perform SEAMCALLs
from other subsystems. I.e. sooner or later, I expect virtualization enabling to
be moved out of KVM.
Short term, one idea would be to have VirtualBox's module (and others) prepare
for that future by pinning kvm-{amd,intel}.ko, and then playing nice if VMX/SVM
is already enabled.
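
To make the "playing nice" part concrete, here is a rough model of the decision in Python. It is purely illustrative: a real module would do this in kernel context in C, and the function name plan_virt_enable and its return values are hypothetical. The bit positions are the architectural ones (CR4.VMXE is bit 13, EFER.SVME is bit 12). Treating a set enable bit as "someone else already owns virtualization" is a simplifying assumption; on Intel, for example, CR4.VMXE being set does not by itself prove that VMXON has been executed.

```python
CR4_VMXE = 1 << 13   # CR4.VMXE: VMX enable bit (Intel)
EFER_SVME = 1 << 12  # EFER.SVME: SVM enable bit (AMD)


def plan_virt_enable(vendor: str, cr4: int, efer: int) -> str:
    """Decide what a co-operating hypervisor module should do on one CPU.

    Returns "reuse" if virtualization already appears to be enabled (e.g. by
    KVM at module load), so the caller must not toggle it, or "enable" if the
    caller has to turn it on itself.
    """
    if vendor == "intel":
        already_enabled = bool(cr4 & CR4_VMXE)
    elif vendor == "amd":
        already_enabled = bool(efer & EFER_SVME)
    else:
        raise ValueError(f"unsupported vendor: {vendor!r}")
    return "reuse" if already_enabled else "enable"


if __name__ == "__main__":
    # Example register values, not read from real hardware.
    print(plan_virt_enable("intel", cr4=CR4_VMXE, efer=0))
    print(plan_virt_enable("amd", cr4=0, efer=0))
```

In the "reuse" case the module would also need to leave the enable bits alone on teardown, which is where pinning kvm-{amd,intel}.ko comes in: the pin keeps KVM from disabling virtualization underneath the other hypervisor.
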
[1] https://lore.kernel.org/all/20240608000639.3295768-9-seanjc@xxxxxxxxxx
[2] https://lore.kernel.org/all/20231107202002.667900-14-aghulati@xxxxxxxxxx