Hi, Wanpeng & Sean,

On 10/21/21 1:03 PM, Wanpeng Li wrote:
On Thu, 21 Oct 2021 at 11:05, zhenwei pi <pizhenwei@xxxxxxxxxxxxx> wrote:
On 10/21/21 4:12 AM, Sean Christopherson wrote:
On Wed, Oct 20, 2021, Wanpeng Li wrote:
On Wed, 20 Oct 2021 at 20:08, zhenwei pi <pizhenwei@xxxxxxxxxxxxx> wrote:
Although the host side exposes the KVM PV SEND IPI feature to the
guest, the guest should still have a chance to disable it.
A typical case for this parameter:
if the host AMD server enables the AVIC feature, the flat mode of the
APIC gets better performance in the guest.
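For illustration, such a guest switch could be wired into the existing
PV IPI setup path in arch/x86/kernel/kvm.c roughly as below. This is a
minimal sketch assuming a standard early_param hook; the flag name and
parameter spelling are illustrative, not the final patch:

#include <linux/init.h>
#include <asm/apic.h>

static bool pv_ipi_disabled __initdata;

static int __init parse_no_kvm_pvipi(char *arg)
{
        pv_ipi_disabled = true;
        return 0;
}
early_param("no-kvm-pvipi", parse_no_kvm_pvipi);

static void __init kvm_setup_pv_ipi(void)
{
        /* Keep the native (flat/x2apic) IPI path when asked to. */
        if (pv_ipi_disabled)
                return;
        /* kvm_send_ipi_mask issues the KVM_HC_SEND_IPI hypercall. */
        apic->send_IPI_mask = kvm_send_ipi_mask;
        apic->send_IPI_mask_allbutself = kvm_send_ipi_mask_allbutself;
}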
Hmm, I didn't find enough valuable information in your posting. We
have observed AMD a lot before.
https://lore.kernel.org/all/CANRm+Cx597FNRUCyVz1D=B6Vs2GX3Sw57X7Muk+yMpi_hb+v1w@xxxxxxxxxxxxxx/T/#u
I too would like to see numbers. I suspect the answer is going to be that
AVIC performs poorly in CPU overcommit scenarios because of the cost of managing
the tables and handling "failed delivery" exits, but that AVIC does quite well
when vCPUs are pinned 1:1 and IPIs rarely require an exit to the host.
Test env:
CPU: AMD EPYC 7642 48-Core Processor
Kmod args (enable AVIC, disable nested virtualization):
modprobe kvm-amd nested=0 avic=1 npt=1
QEMU args (disable x2APIC):
... -cpu host,x2apic=off ...
Benchmark tool:
https://github.com/bytedance/kvm-utils/tree/master/microbenchmark/apic-ipi
~# insmod apic_ipi.ko options=5 && dmesg -c
apic_ipi: 1 NUMA node(s)
apic_ipi: apic [flat]
apic_ipi: apic->send_IPI[default_send_IPI_single+0x0/0x40]
apic_ipi: apic->send_IPI_mask[kvm_send_ipi_mask+0x0/0x10]
apic_ipi: IPI[kvm_send_ipi_mask] from CPU[0] to CPU[1]
apic_ipi: total cycles 375671259, avg 3756
apic_ipi: IPI[flat_send_IPI_mask] from CPU[0] to CPU[1]
apic_ipi: total cycles 221961822, avg 2219
apic->send_IPI_mask[kvm_send_ipi_mask+0x0/0x10]
-> This line shows that the current send_IPI_mask is kvm_send_ipi_mask
(because of the PV SEND IPI feature).

apic_ipi: IPI[kvm_send_ipi_mask] from CPU[0] to CPU[1]
apic_ipi: total cycles 375671259, avg 3756
--> These lines show the average cycles of each kvm_send_ipi_mask call: 3756.

apic_ipi: IPI[flat_send_IPI_mask] from CPU[0] to CPU[1]
apic_ipi: total cycles 221961822, avg 2219
--> These lines show the average cycles of each flat_send_IPI_mask call: 2219.
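For context, the measurement loop in apic_ipi.ko boils down to the
sketch below. This is a simplified reconstruction, not the exact tool
source; the loop bound, the vector choice, and the helper name are
assumptions:

#include <linux/smp.h>
#include <asm/apic.h>
#include <asm/irq_vectors.h>
#include <asm/msr.h>

#define LOOP_COUNT 100000

/* Time LOOP_COUNT single-target IPIs through the currently installed
 * apic->send_IPI_mask hook (kvm_send_ipi_mask or flat_send_IPI_mask);
 * the avg reported above is total / LOOP_COUNT. */
static u64 bench_send_ipi_mask(const struct cpumask *target)
{
        u64 start, total = 0;
        int i;

        for (i = 0; i < LOOP_COUNT; i++) {
                start = rdtsc_ordered();
                apic->send_IPI_mask(target, RESCHEDULE_VECTOR);
                total += rdtsc_ordered() - start;
        }
        return total;
}

With PV IPI, each send is a KVM_HC_SEND_IPI hypercall, i.e. one exit
per send; with flat mode plus AVIC, the ICR write can be accelerated
in hardware when the target vCPU is running, which would be consistent
with the 3756 vs 2219 cycle numbers above.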
Just a single-target IPI benchmark is not enough.
Wanpeng
Benchmark smp_call_function_single (https://github.com/bytedance/kvm-utils/blob/master/microbenchmark/ipi-bench/ipi_bench.c):
Test env: same as above (AMD EPYC 7642, avic=1 npt=1 nested=0, x2APIC
disabled in the guest).
1> without no-kvm-pvipi (guest PV IPI enabled):
ipi_bench_single wait[1], CPU0[NODE0] -> CPU1[NODE0], loop = 100000
elapsed = 424945631 cycles, average = 4249 cycles
ipitime = 385246136 cycles, average = 3852 cycles
ipi_bench_single wait[0], CPU0[NODE0] -> CPU1[NODE0], loop = 100000
elapsed = 419057953 cycles, average = 4190 cycles
2> with no-kvm-pvipi (guest PV IPI disabled):
ipi_bench_single wait[1], CPU0[NODE0] -> CPU1[NODE0], loop = 100000
elapsed = 321756407 cycles, average = 3217 cycles
ipitime = 299433550 cycles, average = 2994 cycles
ipi_bench_single wait[0], CPU0[NODE0] -> CPU1[NODE0], loop = 100000
elapsed = 295382146 cycles, average = 2953 cycles
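For reference, the wait[1]/wait[0] cases map directly onto the wait
argument of smp_call_function_single(). A simplified sketch of the
ipi_bench loop follows; the no-op callback and helper name are
assumptions, not the exact tool source:

#include <linux/smp.h>
#include <asm/msr.h>

#define LOOP_COUNT 100000

static void ipi_noop(void *info)
{
}

/* wait=1 blocks until the remote handler has run, so the measurement
 * includes delivery plus handler execution (the "ipitime"-style
 * number); wait=0 is fire-and-forget and measures send-side cost. */
static u64 bench_call_single(int cpu, int wait)
{
        u64 start = rdtsc_ordered();
        int i;

        for (i = 0; i < LOOP_COUNT; i++)
                smp_call_function_single(cpu, ipi_noop, NULL, wait);

        return (rdtsc_ordered() - start) / LOOP_COUNT;
}

Here again, disabling PV IPI (letting AVIC accelerate the flat-mode
ICR writes) costs about 24-30% fewer cycles per IPI in this pinned,
non-overcommitted setup.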