Re: [PATCH] KVM/x86: Increase max vcpu number to 352

From: Lan Tianyu
Date: Tue Aug 15 2017 - 23:10:54 EST


On 2017-08-15 22:10, Konrad Rzeszutek Wilk wrote:
> On Tue, Aug 15, 2017 at 11:00:04AM +0800, Lan Tianyu wrote:
>> On 2017-08-12 03:35, Konrad Rzeszutek Wilk wrote:
>>> On Fri, Aug 11, 2017 at 03:00:20PM +0200, Radim Krčmář wrote:
>>>> 2017-08-11 10:11+0200, David Hildenbrand:
>>>>> On 11.08.2017 09:49, Lan Tianyu wrote:
>>>>>> Hi Konrad:
>>>>>> Thanks for your review.
>>>>>>
>>>>>> On 2017-08-11 01:50, Konrad Rzeszutek Wilk wrote:
>>>>>>> On Thu, Aug 10, 2017 at 06:00:59PM +0800, Lan Tianyu wrote:
>>>>>>>> Intel Xeon Phi chips will support 352 logical threads. For HPC use
>>>>>>>> cases, a huge VM will be created with as many vcpus as host cpus. This
>>>>>>>> patch increases the max vcpu number to 352.
>>>>>>>
>>>>>>> Why not 1024 or 4096?
>>>>>>
>>>>>> This is on demand. We can set a higher number since KVM already has
>>>>>> x2apic and vIOMMU interrupt remapping support.
>>>>>>
>>>>>>>
>>>>>>> Are there any issues with increasing the value from 288 to 352 right now?
>>>>>>
>>>>>> None found.
>>>>
>>>> Yeah, the only issue until around 2^20 (when we reach the maximum of
>>>> logical x2APIC addressing) should be the size of per-VM arrays when only
>>>> few VCPUs are going to be used.
>>>
>>> Migration with 352 CPUs all being busy dirtying memory and also poking
>>> at various I/O ports (say all of them dirtying the VGA) is no problem?
>>
>> This depends on what kind of workload is running during migration. I
>> think this may affect service downtime since there may be a lot of dirty
>> memory data to transfer after stopping vcpus. This also depends on how
>> the user sets "migrate_set_downtime" for qemu. But I think increasing vcpus
>> will break migration function.
>
> OK, so let me take a step back.
>
> I see this nice 'supported' CPU count that is exposed in kvm module.
>
> Then there is QEMU throwing out a warning if you crank up the CPU count
> above that number.
>
> Red Hat's web-pages talk about CPU count as well.
>
> And I am assuming all of those are around what has been tested and
> what has shown to work. And one of those test-cases surely must
> be migration.
>

Sorry, that was a typo. I meant that increasing the vcpu count shouldn't
break migration; it should only affect service downtime. If there is
such an issue, we should fix it.
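
To illustrate why I expect only the downtime to change: the final
stop-and-copy phase has to send whatever is still dirty once the vcpus
are stopped, so the pause is roughly that amount divided by the link
bandwidth. Below is a minimal sketch of that arithmetic. The function
names and the millisecond limit are hypothetical and this is not QEMU
code, only an assumption about how the estimate could be written.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not a QEMU function. Estimates the stop-and-copy
 * pause in milliseconds: the data still dirty when the vcpus are stopped
 * divided by the migration link bandwidth. */
static double estimate_downtime_ms(uint64_t dirty_bytes, uint64_t bytes_per_sec)
{
    assert(bytes_per_sec > 0); /* a zero-bandwidth link gives no estimate */
    return (double)dirty_bytes * 1000.0 / (double)bytes_per_sec;
}

/* Returns 1 if the estimate fits within the downtime limit the user
 * configured (assumed here to be given in milliseconds), 0 otherwise. */
static int fits_downtime_limit(uint64_t dirty_bytes, uint64_t bytes_per_sec,
                               double limit_ms)
{
    return estimate_downtime_ms(dirty_bytes, bytes_per_sec) <= limit_ms;
}
```

For example, 512 MiB still dirty over a 1 GiB/s link gives 500 ms. More
busy vcpus can leave more dirty data at that point, which makes the
pause longer but does not make the migration incorrect.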


> Ergo, if the vCPU count increase will break migration, then it is
> a regression.
>
> Or a fix/work needs to be done to support a higher CPU count for
> migrating?
>
>
> Is my understanding incorrect?

You are right.

>
>>
>>>
>>>
>>>>
>>>>>>> Also perhaps this should be made into a Kconfig entry?
>>>>>>
>>>>>> That would be another option, but I find that different platforms
>>>>>> define different MAX_VCPU values. If we introduce a generic Kconfig
>>>>>> entry, different platforms should have different ranges.
>>>
>>>
>>> By different platforms you mean q35 vs the older one, and such?
>>
>> I meant that the x86, arm, sparc and other vendors' code defines
>> different max vcpu numbers.
>
> Right, and?

If we introduce a generic Kconfig entry for max vcpus that covers all
vendors, it would need a different max vcpu range for each vendor.
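
As a rough sketch of what I mean by different ranges, expressed in C
rather than Kconfig syntax: each entry carries its own bounds and a
requested value is checked against the bounds of its own entry. The
struct, the function name and the arm number are hypothetical
placeholders; only the 352 for x86 comes from this patch.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical per-vendor bounds for a shared "max vcpus" option. */
struct vcpu_range {
    const char *arch;
    unsigned int min;
    unsigned int max;
};

static const struct vcpu_range vcpu_ranges[] = {
    { "x86", 1, 352 }, /* 352 is the value proposed in this patch */
    { "arm", 1, 255 }, /* placeholder, not taken from the arm code */
};

/* Returns 1 if 'requested' lies inside the range for 'arch', 0 if it
 * lies outside, and -1 if 'arch' has no entry in the table. */
static int vcpu_count_valid(const char *arch, unsigned int requested)
{
    size_t i;

    for (i = 0; i < sizeof(vcpu_ranges) / sizeof(vcpu_ranges[0]); i++) {
        if (strcmp(vcpu_ranges[i].arch, arch) == 0)
            return requested >= vcpu_ranges[i].min &&
                   requested <= vcpu_ranges[i].max;
    }
    return -1;
}
```

A single shared option with one fixed range could not express this,
which is why each vendor would need its own range.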




--
Best regards
Tianyu Lan