Re: [PATCH v2 3/6] KVM: Add a module param to allow enabling virtualization when KVM is loaded
From: Sean Christopherson
Date: Wed May 29 2024 - 11:01:52 EST
On Tue, May 28, 2024, Kai Huang wrote:
> On 24/05/2024 2:39 pm, Chao Gao wrote:
> > On Fri, May 24, 2024 at 11:11:37AM +1200, Huang, Kai wrote:
> > >
> > >
> > > On 23/05/2024 4:23 pm, Chao Gao wrote:
> > > > On Thu, May 23, 2024 at 10:27:53AM +1200, Huang, Kai wrote:
> > > > >
> > > > >
> > > > > On 22/05/2024 2:28 pm, Sean Christopherson wrote:
> > > > > > Add an off-by-default module param, enable_virt_at_load, to let userspace
> > > > > > force virtualization to be enabled in hardware when KVM is initialized,
> > > > > > i.e. just before /dev/kvm is exposed to userspace. Enabling virtualization
> > > > > > during KVM initialization allows userspace to avoid the additional latency
> > > > > > when creating/destroying the first/last VM. Now that KVM uses the cpuhp
> > > > > > framework to do per-CPU enabling, the latency could be non-trivial as the
> > > > > > cpuhup bringup/teardown is serialized across CPUs, e.g. the latency could
> > > > > > be problematic for use case that need to spin up VMs quickly.
> > > > >
> > > > > How about we defer this until there's a real complain that this isn't
> > > > > acceptable? To me it doesn't sound "latency of creating the first VM"
> > > > > matters a lot in the real CSP deployments.
> > > >
> > > > I suspect kselftest and kvm-unit-tests will be impacted a lot because
> > > > hundreds of tests are run serially. And it looks clumsy to reload KVM
> > > > module to set enable_virt_at_load to make tests run faster. I think the
> > > > test slowdown is a more realistic problem than running an off-tree
> > > > hypervisor, so I vote to make enabling virtualization at load time the
> > > > default behavior and if we really want to support an off-tree hypervisor,
> > > > we can add a new module param to opt in enabling virtualization at runtime.
I definitely don't object to making it the default behavior, though I would do so
in a separate patch, e.g. in case enabling virtualization by default somehow
causes problems.
We could also add a Kconfig to control the default behavior, though IMO that'd be
overkill without an actual use case for having virtualization off by default.
> > > I am not following why off-tree hypervisor is ever related to this.
> >
> > Enabling virtualization at runtime was added to support an off-tree hypervisor
> > (see the commit below).
>
> Oh, ok. I was thinking something else.
>
> > >
> > > The problem of enabling virt during module loading by default is it impacts
> > > all ARCHs.
Pratically speaking, Intel is the only vendor where enabling virtualization is
interesting enough for anyone to care. On SVM and all other architectures,
enabling virtualization doesn't meaningfully change the functionality of the
current mode.
And impacting all architectures isn't a bad thing. Yes, it requires getting buy-in
from more people, and maybe additional testing, though in this case we should get
that for "free" by virtue of being in linux-next. But those are one-time costs,
and not particular high costs.
> > > Given this performance downgrade (if we care) can be resolved by
> > > explicitly doing on_each_cpu() below, I am not sure why we want to choose
> > > this radical approach.
Because it's not radical? And manually doing on_each_cpu() requires complexity
that we would ideally avoid.
15+ years ago, when VMX and SVM were nascent technologies, there was likely a
good argument from a security perspective for leaving virtualization disabled.
E.g. the ucode flows were new _and_ massive, and thus a potentially juicy attack
surface.
But these days, AFAIK enabling virtualization is not considered to be a security
risk, nor are there performance or stability downsides. E.g. it's not all that
different than the kernel enabling CR4.PKE even though it's entirely possible
userspace will never actually use protection keys.
> > IIUC, we plan to set up TDX module at KVM load time; we need to enable virt
> > at load time at least for TDX. Definitely, on_each_cpu() can solve the perf
> > concern. But a solution which can also satisfy TDX's need is better to me.
> >
>
> Doing on_each_cpu() explicitly can also meet TDX's purpose. We just
> explicitly enable virtualization during module loading if we are going to
> enable TDX. For all other cases, the behaivour remains the same, unless
> they want to change when to enable virtualization, e.g., when loading module
> by default.
>
> For always, or by default enabling virtualization during module loading, we
> somehow discussed before:
>
> https://lore.kernel.org/kvm/ZiKoqMk-wZKdiar9@xxxxxxxxxx/
>
> My true comment is introducing a module parameter, which is a userspace ABI,
Module params aren't strictly ABI, and even if they were, this would only be
problematic if we wanted to remove the module param *and* doing so was a breaking
change. Enabling virtualization should be entirely transparent to userspace,
at least from a functional perspective; if changing how KVM enables virtualization
breaks userspace then we likely have bigger problems.
> to just fix some performance downgrade seems overkill when it can be
> mitigated by the kernel.
Performance is secondary for me, the primary motivation is simplifying the overall
KVM code base. Yes, we _could_ use on_each_cpu() and enable virtualization
on-demand for TDX, but as above, it's extra complexity without any meaningful
benefit, at least AFAICT.