Re: [PATCH v3 4/4] x86/kvm: add boot parameter for setting max number of vcpus per guest

From: Juergen Gross
Date: Thu Nov 18 2021 - 10:15:39 EST


On 18.11.21 16:05, Sean Christopherson wrote:
On Thu, Nov 18, 2021, Juergen Gross wrote:
On 17.11.21 21:57, Sean Christopherson wrote:
Rather than make this a module param, I would prefer to start with the below
patch (originally from TDX pre-enabling) and then wire up a way for userspace to
_lower_ the max on a per-VM basis, e.g. add a capability.

The main reason for this whole series is a request by a partner
to enable huge VMs on huge machines (huge meaning thousands of
vcpus on thousands of physical cpus).

Making this large number a compile time setting would hurt all
the users who have more standard requirements by allocating the
needed resources even on small systems, so I've switched to a boot
parameter in order to enable those huge numbers only when required.

With Marc's series to use an xarray for the vcpu pointers only the
bitmaps for sending IRQs to vcpus are left which need to be sized
according to the max vcpu limit. Your patch below seems to be fine, but
doesn't help for that case.
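
As a rough illustration of how the limit drives the size of such bitmaps,
here is a minimal userspace sketch. It is not KVM code: the helper name
vcpu_bitmap_bytes() is hypothetical, and it only assumes one bit per vcpu
stored in unsigned long words.

```c
#include <limits.h>
#include <stddef.h>

/* Number of bits in one unsigned long word of bitmap storage. */
#define BITS_PER_ULONG (sizeof(unsigned long) * CHAR_BIT)

/*
 * Hypothetical helper: bytes needed for a bitmap holding one bit per
 * vcpu, rounded up to whole unsigned long words.
 */
static size_t vcpu_bitmap_bytes(size_t max_vcpus)
{
	size_t words = (max_vcpus + BITS_PER_ULONG - 1) / BITS_PER_ULONG;

	return words * sizeof(unsigned long);
}
```

Under these assumptions a limit of 1024 needs 128 bytes per bitmap, while a
limit of 65536 needs 8 KiB per bitmap, which is the cost that would be paid
even on small systems if the large limit were fixed at compile time.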

Ah, you want to let userspace define a MAX_VCPUS that goes well beyond the current
limit without negatively impacting existing setups. My idea of a per-VM capability

Correct.

still works, it would simply require separating the default max from the absolute
max, which this patch mostly does already, it just neglects to set an absolute max.

Which is a good segue into pointing out that if a module param is added, it needs
to be sanity checked against a KVM-defined max. The admin may be trusted to some
extent, but there is zero reason to let userspace set max_vcpus to 4 billion.
At that point, it really is just a param vs. capability question.

I agree. Capping it at e.g. 65536 would probably be a good idea.
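
A minimal sketch of the sanity check discussed here, written as plain
userspace C rather than actual KVM code. All names are hypothetical:
ABS_MAX_VCPUS takes the 65536 suggested above, DEFAULT_MAX_VCPUS is an
assumed placeholder default, and treating 0 as "parameter not set" is also
an assumption.

```c
/* Assumed absolute cap, using the 65536 value suggested above. */
#define ABS_MAX_VCPUS		65536U
/* Assumed placeholder default, used when the parameter is unset (0). */
#define DEFAULT_MAX_VCPUS	1024U

/*
 * Hypothetical helper: turn the admin-supplied value into the limit that
 * is actually used. Unset falls back to the default, and anything above
 * the absolute cap is clamped to the cap.
 */
static unsigned int sanitize_max_vcpus(unsigned int requested)
{
	if (requested == 0)
		return DEFAULT_MAX_VCPUS;
	if (requested > ABS_MAX_VCPUS)
		return ABS_MAX_VCPUS;
	return requested;
}
```

With this split between a default and an absolute max, a request for
4 billion vcpus is clamped to 65536, while a request such as 2048 is passed
through unchanged.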

I like the idea of a capability because there are already two known use cases,
arm64's GIC and x86's TDX, and it could also be used to reduce the kernel's footprint
for use cases that run large numbers of smaller VMs.

The other alternative would be to turn KVM_MAX_VCPUS into a Kconfig knob. I assume

I like combining the capping and a Kconfig knob. So let the distro (or
whoever is building the kernel) decide what the max allowed value is
(e.g. above 65536 per default).

the partner isn't running a vanilla distro build and could set it as they see fit.

And here you are wrong. They'd like to use standard SUSE Linux (SLE).


Juergen
