Re: [PATCH v2 3/6] KVM: Add a module param to allow enabling virtualization when KVM is loaded

From: Huang, Kai
Date: Wed May 29 2024 - 18:45:35 EST


On Wed, 2024-05-29 at 08:01 -0700, Sean Christopherson wrote:
> On Tue, May 28, 2024, Kai Huang wrote:
> > On 24/05/2024 2:39 pm, Chao Gao wrote:
> > > On Fri, May 24, 2024 at 11:11:37AM +1200, Huang, Kai wrote:
> > > >
> > > >
> > > > On 23/05/2024 4:23 pm, Chao Gao wrote:
> > > > > On Thu, May 23, 2024 at 10:27:53AM +1200, Huang, Kai wrote:
> > > > > >
> > > > > >
> > > > > > On 22/05/2024 2:28 pm, Sean Christopherson wrote:
> > > > > > > Add an off-by-default module param, enable_virt_at_load, to let userspace
> > > > > > > force virtualization to be enabled in hardware when KVM is initialized,
> > > > > > > i.e. just before /dev/kvm is exposed to userspace. Enabling virtualization
> > > > > > > during KVM initialization allows userspace to avoid the additional latency
> > > > > > > when creating/destroying the first/last VM. Now that KVM uses the cpuhp
> > > > > > > framework to do per-CPU enabling, the latency could be non-trivial as the
> > > > > > > cpuhup bringup/teardown is serialized across CPUs, e.g. the latency could
> > > > > > > be problematic for use case that need to spin up VMs quickly.
> > > > > >
> > > > > > How about we defer this until there's a real complain that this isn't
> > > > > > acceptable? To me it doesn't sound "latency of creating the first VM"
> > > > > > matters a lot in the real CSP deployments.
> > > > >
> > > > > I suspect kselftest and kvm-unit-tests will be impacted a lot because
> > > > > hundreds of tests are run serially. And it looks clumsy to reload KVM
> > > > > module to set enable_virt_at_load to make tests run faster. I think the
> > > > > test slowdown is a more realistic problem than running an off-tree
> > > > > hypervisor, so I vote to make enabling virtualization at load time the
> > > > > default behavior and if we really want to support an off-tree hypervisor,
> > > > > we can add a new module param to opt in enabling virtualization at runtime.
>
> I definitely don't object to making it the default behavior, though I would do so
> in a separate patch, e.g. in case enabling virtualization by default somehow
> causes problems.
>
> We could also add a Kconfig to control the default behavior, though IMO that'd be
> overkill without an actual use case for having virtualization off by default.
>
> > > > I am not following why off-tree hypervisor is ever related to this.
> > >
> > > Enabling virtualization at runtime was added to support an off-tree hypervisor
> > > (see the commit below).
> >
> > Oh, ok. I was thinking something else.
> >
> > > >
> > > > The problem of enabling virt during module loading by default is it impacts
> > > > all ARCHs.
>
> Pratically speaking, Intel is the only vendor where enabling virtualization is
> interesting enough for anyone to care. On SVM and all other architectures,
> enabling virtualization doesn't meaningfully change the functionality of the
> current mode.
>
> And impacting all architectures isn't a bad thing. Yes, it requires getting buy-in
> from more people, and maybe additional testing, though in this case we should get
> that for "free" by virtue of being in linux-next. But those are one-time costs,
> and not particular high costs.
>
> > > > Given this performance downgrade (if we care) can be resolved by
> > > > explicitly doing on_each_cpu() below, I am not sure why we want to choose
> > > > this radical approach.
>
> Because it's not radical? And manually doing on_each_cpu() requires complexity
> that we would ideally avoid.
>
> 15+ years ago, when VMX and SVM were nascent technologies, there was likely a
> good argument from a security perspective for leaving virtualization disabled.
> E.g. the ucode flows were new _and_ massive, and thus a potentially juicy attack
> surface.
>
> But these days, AFAIK enabling virtualization is not considered to be a security
> risk, nor are there performance or stability downsides. E.g. it's not all that
> different than the kernel enabling CR4.PKE even though it's entirely possible
> userspace will never actually use protection keys.

Agree this is not a security risk. If all other ARCHs are fine to enable
on module loading then I don't see any real problem.

>
> > > IIUC, we plan to set up TDX module at KVM load time; we need to enable virt
> > > at load time at least for TDX. Definitely, on_each_cpu() can solve the perf
> > > concern. But a solution which can also satisfy TDX's need is better to me.
> > >
> >
> > Doing on_each_cpu() explicitly can also meet TDX's purpose. We just
> > explicitly enable virtualization during module loading if we are going to
> > enable TDX. For all other cases, the behaivour remains the same, unless
> > they want to change when to enable virtualization, e.g., when loading module
> > by default.
> >
> > For always, or by default enabling virtualization during module loading, we
> > somehow discussed before:
> >
> > https://lore.kernel.org/kvm/ZiKoqMk-wZKdiar9@xxxxxxxxxx/
> >
> > My true comment is introducing a module parameter, which is a userspace ABI,
>
> Module params aren't strictly ABI, and even if they were, this would only be
> problematic if we wanted to remove the module param *and* doing so was a breaking
> change.  
>

Yes. The concern is once introduced I don't think we can easily remove it
even it becomes useless.


> Enabling virtualization should be entirely transparent to userspace,
> at least from a functional perspective; if changing how KVM enables virtualization
> breaks userspace then we likely have bigger problems.

I am not sure how should I interpret this?

"having a module param" doesn't necessarily mean "entirely transparent to
userspace", right? :-)

>
> > to just fix some performance downgrade seems overkill when it can be
> > mitigated by the kernel.
>
> Performance is secondary for me, the primary motivation is simplifying the overall
> KVM code base. Yes, we _could_ use on_each_cpu() and enable virtualization
> on-demand for TDX, but as above, it's extra complexity without any meaningful
> benefit, at least AFAICT.

Either way works for me.

I just think using a module param to resolve some problem while there can
be solution completely in the kernel seems overkill :-)