On Mon, Jan 17, 2022, Zeng Guang wrote:
> On 1/15/2022 12:18 AM, Sean Christopherson wrote:
> > Userspace can simply do KVM_CREATE_VCPU until it hits KVM_MAX_VCPU_IDS...
> IIUC, what you proposed is to use max_vcpus in kvm for x86 arch (currently
> not present yet) and provide new api for userspace to notify kvm how many
> vcpus in current vm session prior to vCPU creation.
> Thus IPIv can setup PID-table with this information in one shot.
> I'm thinking this may have several things uncertain:
> 1. cannot identify the exact max APIC ID corresponding to max vcpus
> APIC ID definition is platform dependent. A large APIC ID could be assigned
> to one vCPU in theory even running with small max_vcpus. We cannot figure
> out max APIC ID supported mapping to max_vcpus.

Gah, I conflated KVM_CAP_MAX_VCPUS and KVM_MAX_VCPU_IDS. But the underlying idea
still works: extend KVM_MAX_VCPU_IDS to allow userspace to lower the max allowed
vCPU ID to reduce the memory footprint of densely "packed" and/or small VMs.
Agree. This is the purpose to implement this patch. With current solution we proposed, IPIv just

> 2. cannot optimize the memory consumption on PID table to the least at
> run-time
> In case "-smp=small_n,maxcpus=large_N", kvm has to allocate memory to
> accommodate large_N vcpus at the beginning no matter whether all maxcpus
> will run.

That's a feature. E.g. if userspace defines a max vCPU ID that is larger than
what is required at boot, e.g. to hotplug vCPUs, then consuming a few extra pages
of memory to ensure that IPIv will be supported for hotplugged vCPUs is very
desirable behavior. Observing poor performance on hotplugged vCPUs because the
host was under memory pressure is far worse.

And the goal isn't to achieve the smallest memory footprint possible, it's to
avoid allocating 32kb of memory when userspace wants to run a VM with only a
handful of vCPUs, i.e. when 4kb will suffice. Consuming 32kb of memory for a VM
with hundreds of vCPUs is a non-issue, e.g. it's highly unlikely to be running
multiple such VMs on a single host, and such hosts will likely have hundreds of
gb of RAM. Conversely, hosts running small VMs will likely run tens or hundreds
of small VMs, e.g. for container scenarios, in which case reducing the per-VM memory
footprint is much more valuable and also easier to achieve.
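
As a rough sanity check on the 32kb vs. 4kb numbers, here is a minimal sketch
of the sizing arithmetic. It assumes 8-byte table entries, 4kb pages, and a
build-time limit of 4096 vCPU IDs; the helper and macro names are made up for
illustration and are not KVM code.

```c
#include <stddef.h>

/* Assumptions for illustration only, not taken from the KVM sources. */
#define SKETCH_ENTRY_SIZE 8u     /* one 64-bit pointer per vCPU ID */
#define SKETCH_PAGE_SIZE  4096u  /* 4kb pages */

/*
 * Bytes needed for a table indexed by vCPU ID in [0, max_vcpu_ids),
 * rounded up to a whole number of pages.
 */
static size_t sketch_pid_table_bytes(unsigned int max_vcpu_ids)
{
	size_t raw = (size_t)max_vcpu_ids * SKETCH_ENTRY_SIZE;

	return (raw + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE * SKETCH_PAGE_SIZE;
}
```

With those assumptions, a limit of 4096 IDs needs 32kb, while any limit of up
to 512 IDs fits in a single 4kb page, which is why letting userspace lower the
limit is enough for the small-VM case.
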
> 3. Potential backward-compatible problem
> If running with old QEMU version, kvm cannot get expected information so as
> to make a fallback to use KVM_MAX_VCPU_IDS by default. It's feasible but not
> benefit on memory optimization for PID table.

That's totally fine. This is purely a memory optimization, IPIv will still work
as intended if userspace doesn't lower the max vCPU ID, it'll just consume a bit
more memory.
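
To make the fallback behavior concrete, something along these lines is what I
have in mind. This is only a self-contained model under stated assumptions:
the struct, the function names, the 4096 limit, and the choice of error codes
are all hypothetical and do not correspond to existing KVM code or uAPI.

```c
#include <errno.h>

#define MODEL_MAX_VCPU_IDS 4096u  /* assumed build-time limit */

/* Hypothetical per-VM state; 0 means userspace never lowered the limit. */
struct vm_model {
	unsigned int max_vcpu_ids;
	unsigned int created_vcpus;
};

/* Hypothetical knob: may only be used before the first vCPU is created. */
static int vm_set_max_vcpu_ids(struct vm_model *vm, unsigned int ids)
{
	if (ids == 0 || ids > MODEL_MAX_VCPU_IDS)
		return -EINVAL;
	if (vm->created_vcpus)
		return -EBUSY;

	vm->max_vcpu_ids = ids;
	return 0;
}

/* Old userspace never calls the setter and gets the build-time limit. */
static unsigned int vm_effective_max_vcpu_ids(const struct vm_model *vm)
{
	return vm->max_vcpu_ids ? vm->max_vcpu_ids : MODEL_MAX_VCPU_IDS;
}

/* vCPU creation, including hotplug, is bounded by the effective limit. */
static int vm_create_vcpu(struct vm_model *vm, unsigned int vcpu_id)
{
	if (vcpu_id >= vm_effective_max_vcpu_ids(vm))
		return -EINVAL;

	vm->created_vcpus++;
	return 0;
}
```

In this model an old VMM behaves exactly as today, and a VMM that wants to
hotplug vCPUs simply sets the limit to cover the largest vCPU ID it may ever
create.
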