Q. about KVM and CPU hotplug

From: Tian, Kevin
Date: Tue Nov 30 2021 - 03:27:16 EST


Hi, Paolo/Thomas,

I'm curious about the consequence if KVM fails to initialize a
hotplugged CPU.

Looking at the code, KVM has been added to the CPU hotplug state
machine:

r = cpuhp_setup_state_nocalls(CPUHP_AP_KVM_STARTING, "kvm/cpu:starting",
                              kvm_starting_cpu, kvm_dying_cpu);

static int kvm_starting_cpu(unsigned int cpu)
{
	raw_spin_lock(&kvm_count_lock);
	if (kvm_usage_count)
		hardware_enable_nolock(NULL);
	raw_spin_unlock(&kvm_count_lock);
	return 0;
}

kvm_starting_cpu() always returns success, as callbacks in the
STARTING section are not allowed to fail.

However, hardware_enable_nolock() may fail for various reasons:

static void hardware_enable_nolock(void *junk)
{
	int cpu = raw_smp_processor_id();
	int r;

	if (cpumask_test_cpu(cpu, cpus_hardware_enabled))
		return;

	cpumask_set_cpu(cpu, cpus_hardware_enabled);

	r = kvm_arch_hardware_enable();

	if (r) {
		cpumask_clear_cpu(cpu, cpus_hardware_enabled);
		atomic_inc(&hardware_enable_failed);
		pr_info("kvm: enabling virtualization on CPU%d failed\n", cpu);
	}
}

Upon error, hardware_enable_failed is incremented. However, this variable
is checked only in hardware_enable_all(), which is called when the first
VM is created.

This implies that KVM may be left in a state where it doesn't know that a
CPU is not ready to host VMX operations.
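
To make that state concrete, here is a minimal userspace model of the two
functions quoted above. It is a sketch, not kernel code: every name with a
model_ prefix is hypothetical, locking is omitted, and the failure is
injected by hand through model_broken_cpu.

```c
/* Userspace model only: model_* names are hypothetical, not kernel APIs. */
#include <stdbool.h>

#define MODEL_NR_CPUS 8

static bool model_hw_enabled[MODEL_NR_CPUS]; /* stands in for cpus_hardware_enabled */
static int  model_enable_failed;             /* stands in for hardware_enable_failed */
static int  model_usage_count = 1;           /* pretend at least one VM exists */
static int  model_broken_cpu = -1;           /* CPU whose enable fails, -1 for none */

/* Stand-in for kvm_arch_hardware_enable(): 0 on success, nonzero on failure. */
static int model_arch_hardware_enable(int cpu)
{
	return cpu == model_broken_cpu ? -1 : 0;
}

/* Mirrors hardware_enable_nolock(): the failure is recorded, not returned. */
static void model_hardware_enable(int cpu)
{
	if (model_hw_enabled[cpu])
		return;

	model_hw_enabled[cpu] = true;

	if (model_arch_hardware_enable(cpu)) {
		model_hw_enabled[cpu] = false;
		model_enable_failed++;
	}
}

/* Mirrors kvm_starting_cpu(): always reports success to the hotplug core. */
static int model_starting_cpu(int cpu)
{
	if (model_usage_count)
		model_hardware_enable(cpu);
	return 0;
}
```

With model_broken_cpu set to 3, model_starting_cpu(3) still returns 0, so
the caller sees a successful bring-up while model_hw_enabled[3] stays false
and only model_enable_failed records that anything went wrong.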

Then I'm curious what will happen if a vCPU is scheduled to this CPU. Does
KVM indirectly catch it (e.g. vmenter fails) and return a deterministic
error to QEMU at some point, or may it lead to undefined behavior? And is
there any method to prevent a vCPU thread from being scheduled to the CPU?

We ran into this open question when considering TDX and CPU hotplug.

By design, the current generation of TDX doesn't support CPU hotplug.
Only boot-time CPUs can be initialized for TDX (and this must be done en
masse in one breath). Attempting to do SEAMCALLs on a hotplugged CPU
simply fails, so it potentially affects any trust domain whose vCPUs
are scheduled to the hotplugged CPU.

The puzzle is whether we should just document this restriction or need a
more proactive measure (e.g. to prevent such a case from happening). Since
it's similar to the above situation where KVM fails to init on a hotplugged
CPU, we'd like to seek your suggestions first.
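
As one possible shape for a proactive measure, the sketch below models a
bring-up path whose callback is allowed to fail, so that a CPU which cannot
be initialized never becomes schedulable. It is a userspace model under
stated assumptions, not a proposal for specific kernel code: all sketch_*
names are hypothetical, and it assumes the callback can run at a point in
bring-up where an error aborts the operation, which is not the case for the
STARTING section used today.

```c
/* Userspace model only: sketch_* names are hypothetical, not kernel APIs. */
#include <stdbool.h>

#define SKETCH_NR_CPUS 8

static bool sketch_online[SKETCH_NR_CPUS];  /* CPUs the scheduler may use */
static bool sketch_virt_on[SKETCH_NR_CPUS]; /* CPUs with virtualization enabled */
static int  sketch_unsupported_cpu = -1;    /* CPU that cannot be initialized */

/* Startup callback that may fail, unlike a STARTING callback. */
static int sketch_virt_online(int cpu)
{
	if (cpu == sketch_unsupported_cpu)
		return -1;

	sketch_virt_on[cpu] = true;
	return 0;
}

/* Teardown callback, used here to undo a failed bring-up. */
static void sketch_virt_offline(int cpu)
{
	sketch_virt_on[cpu] = false;
}

/* Bring a CPU up; mark it online only if the callback succeeded. */
static int sketch_cpu_up(int cpu)
{
	int r = sketch_virt_online(cpu);

	if (r) {
		sketch_virt_offline(cpu); /* roll back; the CPU stays offline */
		return r;
	}

	sketch_online[cpu] = true;
	return 0;
}
```

In this model, sketch_cpu_up() on the unsupported CPU returns an error and
leaves sketch_online[] clear for it, so no vCPU thread could land there. The
cost is that the hotplug request itself is rejected rather than tolerated.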

Thanks
Kevin