Re: [RFC PATCH v2 0/3] Support CPU hotplug for ARM64

From: Marc Zyngier
Date: Tue Jul 16 2019 - 04:32:50 EST


Hi Jia,

On 16/07/2019 08:59, Jia He wrote:
> Hi Marc
>
> On 2019/7/10 17:15, Marc Zyngier wrote:
>> On 09/07/2019 20:06, Maran Wilson wrote:
>>> On 7/5/2019 3:12 AM, James Morse wrote:
>>>> Hi guys,
>>>>
>>>> (CC: +kvmarm list)
>>>>
>>>> On 29/06/2019 03:42, Xiongfeng Wang wrote:
>>>>> This patchset marks all the GICC nodes in the MADT as possible CPUs, even those
>>>>> that are disabled, but only the enabled GICC nodes are marked as present CPUs.
>>>>> This lets the kernel initialize some CPU-related data structures in advance, before
>>>>> the CPU is actually hot-added to the system. This patchset also implements
>>>>> 'acpi_(un)map_cpu()' and 'arch_(un)register_cpu()' for ARM64. These functions are
>>>>> needed to enable CPU hotplug.
>>>>>
>>>>> To support CPU hotplug, we need to add all the possible GICC nodes to the MADT,
>>>>> including those for CPUs that are not present but may be hot-added later. Those
>>>>> CPUs are marked as disabled in their GICC nodes.
>>>> ... what do you need this for?
>>>>
>>>> (The term cpu-hotplug in the arm world almost never means hot-adding a new package/die to
>>>> the platform, we usually mean taking CPUs online/offline for power management. e.g.
>>>> cpuhp_offline_cpu_device())
>>>>
>>>> It looks like you're adding support for hot-adding a new package/die to the platform ...
>>>> but only for virtualisation.
>>>>
>>>> I don't see why this is needed for virtualisation. The in-kernel irqchip needs to know
>>>> these vcpu exist before you can enter the guest for the first time. You can't create them
>>>> late. At best you're saving the host scheduling a vcpu that is offline. Is this really a
>>>> problem?
>>>>
>>>> If we moved PSCI support to user-space, you could avoid creating host vcpu threads until
>>>> the guest brings the vcpu online, which would solve that problem, and save the host
>>>> resources for the thread too. (and it's acpi/dt agnostic)
>>>>
>>>> I don't see the difference here between booting the guest with 'maxcpus=1', and bringing
>>>> the vcpu online later. The only real difference seems to be moving the can-be-online
>>>> policy into the hypervisor/VMM...
>>> Isn't that an important distinction from a cloud service provider's
>>> perspective?
>>>
>>> As far as I understand it, you also need CPU hotplug capabilities to
>>> support things like Kata runtime under Kubernetes. i.e. when
>>> implementing your containers in the form of light weight VMs for the
>>> additional security ... and the orchestration layer cannot determine
>>> ahead of time how much CPU/memory resources are going to be needed to
>>> run the pod(s).
>> Why would it be any different? You can pre-allocate your vcpus, leave
>> them parked until some external agent decides to signal the container
>> that it can use another bunch of CPUs. At that point, the container
>> must actively boot these vcpus (they aren't going to come up by magic).
>>
>> Given that you must have sized your virtual platform to deal with the
>> maximum set of resources you anticipate (think of the GIC
>> redistributors, for example), I really wonder what you gain here.
> I agree with your point on the GIC aspect. It will mess things up if
> GIC resources are made hotpluggable in qemu.

It is far worse than just a mess. You'd need to come up with a way to
place your redistributors in memory, and tell the running guest where
these redistributors are. Currently, there is no method to describe such
changes to the address space, and I certainly don't want QEMU to invent
one. This needs to be modeled after what would happen on real HW.
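
To make the sizing constraint concrete, here is a rough sketch of the
arithmetic. It assumes GICv3 without the extra GICv4 frames, so each
redistributor takes two contiguous 64 KiB frames. The base address and the
function names are made up for illustration and do not describe any real
platform or QEMU machine type.

```python
# Assumption: GICv3 only, so each redistributor occupies two contiguous
# 64 KiB frames (RD_base followed by SGI_base).
FRAME_SIZE = 64 * 1024
FRAMES_PER_REDIST = 2
REDIST_STRIDE = FRAME_SIZE * FRAMES_PER_REDIST  # 128 KiB per vcpu

# Hypothetical guest-physical base address, for illustration only.
REDIST_BASE = 0x0800_0000


def redist_region_size(max_vcpus):
    """Bytes of guest address space to reserve for max_vcpus redistributors."""
    if max_vcpus < 1:
        raise ValueError("need at least one vcpu")
    return max_vcpus * REDIST_STRIDE


def redist_frame_base(vcpu_index, max_vcpus):
    """Guest-physical address of RD_base for the given vcpu index."""
    if not 0 <= vcpu_index < max_vcpus:
        raise IndexError("vcpu index outside the region sized at boot")
    return REDIST_BASE + vcpu_index * REDIST_STRIDE
```

The region is a function of the maximum vcpu count only, which is why it
has to be fixed before the guest first runs: a vcpu index beyond that
maximum has no address to live at.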

> But it would also be better if the VMM only started up a limited number
> of vcpu threads.
>
> How about:
>
> 1. qemu starts only N vcpu threads (-smp N, maxcpus=M)
>
> 2. qemu reserves the GIC resources for the maximum of M vcpus

Note that this implies actually initializing M vcpus in the VM. You may
not have created the corresponding (M - N) threads, but the vcpus will
exist. Can you please quantify how much you'd save by doing that?
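
As a toy illustration of that distinction (this is not QEMU or KVM code,
and every name in it is hypothetical), the M vcpus exist from the start and
a later "hot-add" can only attach a host thread to one of them:

```python
class ToyVM:
    """Toy model of 'M vcpus created, N threads started'. Not real VMM code."""

    def __init__(self, boot_vcpus, max_vcpus):
        if not 1 <= boot_vcpus <= max_vcpus:
            raise ValueError("need 1 <= boot_vcpus <= max_vcpus")
        # All M vcpus are initialized up front so the irqchip can be sized.
        self.vcpus = range(max_vcpus)
        # Only the first N vcpus get a host thread at startup.
        self.threads = set(range(boot_vcpus))

    def hotplug_add(self, vcpu):
        """Start a host thread for a vcpu that already exists."""
        if vcpu not in self.vcpus:
            raise ValueError("vcpu was never created; it cannot be added late")
        self.threads.add(vcpu)

    def threads_saved(self):
        """Host threads not yet created: the only thing being deferred."""
        return len(self.vcpus) - len(self.threads)
```

In this model the only deferred resource is the (M - N) host threads, which
is the saving I'm asking you to quantify.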

> 3. when the qmp cmd cpu hotplug-add is triggered, send a GED event to the guest kernel
>
> 4. the guest kernel receives it and triggers the acpi plug process.
>
> Currently ACPI_CPU_HOTPLUG is enabled in Kconfig but is completely unworkable.

Well, there is so far *zero* CPU_HOTPLUG in the arm64 kernel other than
getting CPUs in and out of PSCI.

Thanks,

M.
--
Jazz is not dead. It just smells funny...