Re: [PATCH v2 2/6] x86/kvm: add boot parameter for adding vcpu-id bits

From: Eduardo Habkost
Date: Fri Sep 03 2021 - 15:48:35 EST


On Fri, Sep 03, 2021 at 03:08:03PM +0200, Juergen Gross wrote:
> Today the maximum vcpu-id of a kvm guest's vcpu on x86 systems is set
> via a #define in a header file.
>
> In order to support higher vcpu-ids without generally increasing the
> memory consumption of guests on the host (some guest structures contain
> arrays sized by KVM_MAX_VCPU_ID) add a boot parameter for adding some
> bits to the vcpu-id. Additional bits are needed as the vcpu-id is
> constructed via bit-wise concatenation of socket-id, core-id, etc.
> As those ids maximum values are not always a power of 2, the vcpu-ids
> are sparse.
>
> The additional number of bits needed is basically the number of
> topology levels with a non-power-of-2 maximum value, excluding the top
> most level.
>
> The default value of the new parameter will be to take the correct
> setting from the host's topology.

Having the default depend on the host topology makes the host
behaviour unpredictable (which might be a problem when migrating
VMs from another host with a different topology). Can't we just
default to 2?

>
> Calculating the maximum vcpu-id dynamically requires to allocate the
> arrays using KVM_MAX_VCPU_ID as the size dynamically.
>
> Signed-of-by: Juergen Gross <jgross@xxxxxxxx>
> ---
> V2:
> - switch to specifying additional bits (based on comment by Vitaly
> Kuznetsov)
>
> Signed-off-by: Juergen Gross <jgross@xxxxxxxx>
> ---
[...]
> #define KVM_MAX_VCPUS 288
> #define KVM_SOFT_MAX_VCPUS 240
> -#define KVM_MAX_VCPU_ID 1023
> +#define KVM_MAX_VCPU_ID kvm_max_vcpu_id()
[...]
> +unsigned int kvm_max_vcpu_id(void)
> +{
> + int n_bits = fls(KVM_MAX_VCPUS - 1);
> +
> + if (vcpu_id_add_bits < -1 || vcpu_id_add_bits > (32 - n_bits)) {
> + pr_err("Invalid value of vcpu_id_add_bits=%d parameter!\n",
> + vcpu_id_add_bits);
> + vcpu_id_add_bits = -1;
> + }
> +
> + if (vcpu_id_add_bits >= 0) {
> + n_bits += vcpu_id_add_bits;
> + } else {
> + n_bits++; /* One additional bit for core level. */
> + if (topology_max_die_per_package() > 1)
> + n_bits++; /* One additional bit for die level. */
> + }
> +
> + if (!n_bits)
> + n_bits = 1;
> +
> + return (1U << n_bits) - 1;

The largest possible VCPU ID is not KVM_MAX_VCPU_ID,
it's (KVM_MAX_VCPU_ID - 1). This is enforced by
kvm_vm_ioctl_create_vcpu().

That would mean KVM_MAX_VCPU_ID should be (1 << n_bits) instead
of ((1 << n_bits) - 1), wouldn't it?


> +}
> +EXPORT_SYMBOL_GPL(kvm_max_vcpu_id);
> +
> /*
> * Restoring the host value for MSRs that are only consumed when running in
> * usermode, e.g. SYSCALL MSRs and TSC_AUX, can be deferred until the CPU
> --
> 2.26.2
>

--
Eduardo