On Fri, May 24, 2024 at 11:11:37AM +1200, Huang, Kai wrote:
On 23/05/2024 4:23 pm, Chao Gao wrote:
On Thu, May 23, 2024 at 10:27:53AM +1200, Huang, Kai wrote:
On 22/05/2024 2:28 pm, Sean Christopherson wrote:
Add an off-by-default module param, enable_virt_at_load, to let userspace
force virtualization to be enabled in hardware when KVM is initialized,
i.e. just before /dev/kvm is exposed to userspace. Enabling virtualization
during KVM initialization allows userspace to avoid the additional latency
when creating/destroying the first/last VM. Now that KVM uses the cpuhp
framework to do per-CPU enabling, the latency could be non-trivial as the
cpuhp bringup/teardown is serialized across CPUs, e.g. the latency could
be problematic for use cases that need to spin up VMs quickly.
How about we defer this until there's a real complaint that this isn't
acceptable? To me it doesn't sound like the "latency of creating the first
VM" matters a lot in real CSP deployments.
I suspect kselftest and kvm-unit-tests will be impacted a lot because
hundreds of tests are run serially. And it looks clumsy to reload the KVM
module to set enable_virt_at_load to make tests run faster. I think the
test slowdown is a more realistic problem than running an off-tree
hypervisor, so I vote to make enabling virtualization at load time the
default behavior, and if we really want to support an off-tree hypervisor,
we can add a new module param to opt in to enabling virtualization at runtime.
I am not following why an off-tree hypervisor is related to this at all.
Enabling virtualization at runtime was added to support an off-tree hypervisor
(see the commit below).
The problem with enabling virt during module loading by default is that it
impacts all architectures. Given that this performance regression (if we
care) can be resolved by explicitly doing on_each_cpu() below, I am not sure
why we would want to choose this radical approach.
IIUC, we plan to set up the TDX module at KVM load time; we need to enable
virt at load time at least for TDX. Definitely, on_each_cpu() can solve the
perf concern. But a solution which also satisfies TDX's need is better to me.