Juergen Gross <jgross@xxxxxxxx> writes:
On 14.07.21 13:15, Vitaly Kuznetsov wrote:
Juergen Gross <jgross@xxxxxxxx> writes:
Today the maximum number of vcpus of a kvm guest is set via a #define
in a header file.
In order to support higher vcpu numbers for guests without generally
increasing the memory consumption of guests on the host especially on
very large systems add a boot parameter for specifying the number of
allowed vcpus for guests.
The default will still be the current setting of 288. The value 0 has
the special meaning to limit the number of possible vcpus to the
number of possible cpus of the host.
Signed-off-by: Juergen Gross <jgross@xxxxxxxx>
---
Documentation/admin-guide/kernel-parameters.txt | 10 ++++++++++
arch/x86/include/asm/kvm_host.h | 5 ++++-
arch/x86/kvm/x86.c | 7 +++++++
3 files changed, 21 insertions(+), 1 deletion(-)
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 99bfa53a2bbd..8eb856396ffa 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -2373,6 +2373,16 @@
guest can't have more vcpus than the set value + 1.
Default: 1023
+ kvm.max_vcpus= [KVM,X86] Set the maximum allowed numbers of vcpus per
+ guest. The special value 0 sets the limit to the number
+ of physical cpus possible on the host (including not
+ yet hotplugged cpus). Higher values will result in
+ slightly higher memory consumption per guest. Depending
+ on the value and the virtual topology the maximum
+ allowed vcpu-id might need to be raised, too (see
+ kvm.max_vcpu_id parameter).
I'd suggest to at least add a sanity check: 'max_vcpu_id' should always
be >= 'max_vcpus'. Alternatively, we can replace 'max_vcpu_id' with say
'vcpu_id_to_vcpus_ratio' and set it to e.g. '4' by default.
Either would be fine with me.
A default of '2' for the ratio would seem more appropriate for me,
however. A thread count per core not being a power of 2 is quite
unlikely, and the worst case scenario for cores per socket would be
2^n + 1.
(I vaguely recall AMD EPYC had more than thread id (package id?)
encapsulated into APIC id).
Personally, I'd vote for introducing a 'ratio' parameter then so
generally users will only have to set 'kvm.max_vcpus'.
+ Default: 288
+
l1tf= [X86] Control mitigation of the L1TF vulnerability on
affected CPUs
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 39cbc4b6bffb..65ae82a5d444 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -37,7 +37,8 @@
#define __KVM_HAVE_ARCH_VCPU_DEBUGFS
-#define KVM_MAX_VCPUS 288
+#define KVM_DEFAULT_MAX_VCPUS 288
+#define KVM_MAX_VCPUS max_vcpus
#define KVM_SOFT_MAX_VCPUS 240
#define KVM_DEFAULT_MAX_VCPU_ID 1023
#define KVM_MAX_VCPU_ID max_vcpu_id
@@ -1509,6 +1510,8 @@ extern u64 kvm_max_tsc_scaling_ratio;
extern u64 kvm_default_tsc_scaling_ratio;
/* bus lock detection supported? */
extern bool kvm_has_bus_lock_exit;
+/* maximum number of vcpus per guest */
+extern unsigned int max_vcpus;
/* maximum vcpu-id */
extern unsigned int max_vcpu_id;
/* per cpu vcpu bitmasks (disable preemption during usage) */
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index a9b0bb2221ea..888c4507504d 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -177,6 +177,10 @@ module_param(force_emulation_prefix, bool, S_IRUGO);
int __read_mostly pi_inject_timer = -1;
module_param(pi_inject_timer, bint, S_IRUGO | S_IWUSR);
+unsigned int __read_mostly max_vcpus = KVM_DEFAULT_MAX_VCPUS;
+module_param(max_vcpus, uint, S_IRUGO);
+EXPORT_SYMBOL_GPL(max_vcpus);
+
unsigned int __read_mostly max_vcpu_id = KVM_DEFAULT_MAX_VCPU_ID;
module_param(max_vcpu_id, uint, S_IRUGO);
@@ -10648,6 +10652,9 @@ int kvm_arch_hardware_setup(void *opaque)
if (boot_cpu_has(X86_FEATURE_XSAVES))
rdmsrl(MSR_IA32_XSS, host_xss);
+ if (max_vcpus == 0)
+ max_vcpus = num_possible_cpus();
Is this special case really needed? I mean 'max_vcpus' is not '0' by
default so whoever sets it manually probably knows how big his guests
are going to be anyway and it is not always obvious how many CPUs are
reported by 'num_possible_cpus()' (ACPI tables can be weird for example).
The idea was to make it easy for anyone managing a large fleet of hosts
and wanting to have a common setting for all of them.
I see. It seems to be uncommon indeed to run guests with more vCPUs than
host pCPUs so everything >= num_online_cpus() should be OK. My only
concern about num_possible_cpus() is that it is going to be hard to
explain what 'possible CPUs' mean (but whoever cares that much about
wasting memory can always set the required value manually).
Attachment:
OpenPGP_0xB0DE9DD628BF132F.asc
Description: OpenPGP public key
Attachment:
OpenPGP_signature
Description: OpenPGP digital signature