On Fri, Jan 14, 2022, Zeng Guang wrote:
> IIUC, what you proposed is to use max_vcpus in kvm for x86 arch (currently not present yet) and
> On 1/14/2022 6:09 AM, Sean Christopherson wrote:
> > On Fri, Dec 31, 2021, Zeng Guang wrote:
> > > +static int vmx_expand_pid_table(struct kvm_vmx *kvm_vmx, int entry_idx)
> > > +{
> > > +	u64 *last_pid_table;
> > > +	int last_table_size, new_order;
> > > +
> > > +	if (entry_idx <= kvm_vmx->pid_last_index)
> > > +		return 0;
> > > +
> > > +	last_pid_table = kvm_vmx->pid_table;
> > > +	last_table_size = table_index_to_size(kvm_vmx->pid_last_index + 1);
> > > +	new_order = get_order(table_index_to_size(entry_idx + 1));
> > > +
> > > +	if (vmx_alloc_pid_table(kvm_vmx, new_order))
> > > +		return -ENOMEM;
> > > +
> > > +	memcpy(kvm_vmx->pid_table, last_pid_table, last_table_size);
> > > +	kvm_make_all_cpus_request(&kvm_vmx->kvm, KVM_REQ_PID_TABLE_UPDATE);
> > > +
> > > +	/* Now old PID table can be freed safely as no vCPU is using it. */
> > > +	free_pages((unsigned long)last_pid_table, get_order(last_table_size));
> >
> > This is terrifying.  I think it's safe?  But it's still terrifying.
>
> Freeing the old PID table here is safe: KVM makes the KVM_REQ_PID_TABLE_UPDATE
> request with the KVM_REQUEST_WAIT flag, which forces all vCPUs to take a VM-exit
> and update the VMCS field to point at the newly allocated PID table. At that
> point, the old PID table is no longer referenced by any vCPU.
> Do you mean it still has a potential problem?

No, I do think it's safe, but it is still terrifying :-)
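
For anyone following along, the ordering being argued about (publish the new
table, wait until every vCPU has switched over, only then free the old one) can
be modelled in plain userspace C. This is only a rough sketch under stated
assumptions, not the patch's implementation: struct vm, struct vcpu,
request_table_update_and_wait() and expand_pid_table() are hypothetical names,
the "wait" is a synchronous loop standing in for a waiting request, and the
table grows per entry rather than in page orders.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_MAX_VCPUS 8	/* hypothetical small limit for this sketch */

struct vcpu {
	/* Stands in for the per-vCPU VMCS field that points at the PID table. */
	const uint64_t *pid_table;
};

struct vm {
	uint64_t *pid_table;
	int last_index;		/* highest valid entry index, -1 when empty */
	struct vcpu vcpus[SKETCH_MAX_VCPUS];
	int nr_vcpus;
};

/*
 * Model of a "make request and wait" call: when this returns, every vCPU
 * has reloaded its pointer, so nobody references the old table anymore.
 */
static void request_table_update_and_wait(struct vm *vm)
{
	for (int i = 0; i < vm->nr_vcpus; i++)
		vm->vcpus[i].pid_table = vm->pid_table;
}

static int expand_pid_table(struct vm *vm, int entry_idx)
{
	uint64_t *old = vm->pid_table, *new;
	size_t old_n, new_n;

	if (entry_idx <= vm->last_index)
		return 0;	/* already large enough */

	old_n = (size_t)(vm->last_index + 1);
	new_n = (size_t)entry_idx + 1;

	new = calloc(new_n, sizeof(*new));
	if (!new)
		return -1;

	if (old_n)
		memcpy(new, old, old_n * sizeof(*new));

	/* 1) publish the new table, 2) wait for all vCPUs, 3) free the old. */
	vm->pid_table = new;
	vm->last_index = entry_idx;
	request_table_update_and_wait(vm);
	free(old);
	return 0;
}
```

The free is safe only because step 2 completes before step 3. If the wait were
dropped or made asynchronous, a vCPU could still hold the stale pointer when the
old table is released, which is the part that makes this pattern uncomfortable
to reason about.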
> > Rather than dynamically react as vCPUs are created, what about we make max_vcpus
> > common[*], extend KVM_CAP_MAX_VCPUS to allow userspace to override max_vcpus,
> > and then have the IPIv support allocate the PID table on first vCPU creation
> > instead of in vmx_vm_init()?
> >
> > That will give userspace an opportunity to lower max_vcpus to reduce memory
> > consumption without needing to dynamically muck with the table in KVM.  Then
> > this entire patch goes away.
>
> IIUC, relying on userspace is risky.

That's why we have cgroups, rlimits, etc...

> This way, userspace could also assign a large max_vcpus and then not use them
> at all. That does not achieve the goal of saving as much memory as possible;
> it is no different from using KVM_MAX_VCPU_IDS to allocate the PID table.

Userspace can simply do KVM_CREATE_VCPU until it hits KVM_MAX_VCPU_IDS...
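
To put rough numbers on the memory being discussed, here is a small userspace
sketch of the sizing arithmetic. It rests on assumptions that are not stated in
this thread: 8-byte PID-table entries (suggested by the u64 table in the patch),
4 KiB pages, and 4096 as an illustrative stand-in for KVM_MAX_VCPU_IDS. The
helpers pid_table_bytes(), size_to_order() and pid_table_order() are
hypothetical; size_to_order() only mimics what get_order() computes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE	4096u	/* assumption: 4 KiB pages */
#define SKETCH_MAX_VCPU_IDS	4096u	/* assumption: illustrative upper bound */

/* Bytes needed for a table with one 8-byte entry per vCPU ID. */
static size_t pid_table_bytes(unsigned int nr_ids)
{
	return (size_t)nr_ids * sizeof(uint64_t);
}

/* Smallest order such that (SKETCH_PAGE_SIZE << order) >= size. */
static unsigned int size_to_order(size_t size)
{
	unsigned int order = 0;
	size_t span = SKETCH_PAGE_SIZE;

	while (span < size) {
		span <<= 1;
		order++;
	}
	return order;
}

/* Allocation order for a table sized once, up front, from max_vcpus. */
static unsigned int pid_table_order(unsigned int max_vcpus)
{
	return size_to_order(pid_table_bytes(max_vcpus));
}
```

Under these assumptions, sizing for the full ID space costs an order-3
allocation (32 KiB), while any max_vcpus up to 512 fits in a single page. That
is the saving a userspace-provided max_vcpus could deliver without expanding
the table at runtime.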