[snip]
Here is the microbenchmarck data with perf bench futex wake case on 3C5000
single-way machine, there are 16 cpus on 3C5000 single-way machine, VM
has 16 vcpus also. The benchmark data is ms time unit to wakeup 16 threads,
the performance is higher if data is smaller.
perf bench futex wake, Wokeup 16 of 16 threads in ms
--physical machine-- --VM original-- --VM with pv ipi patch--
0.0176 ms 0.1140 ms 0.0481 ms
---
Change in V4:
1. Modfiy pv ipi hook function name call_func_ipi() and
call_func_single_ipi() with send_ipi_mask()/send_ipi_single(), since pv
ipi is used for both remote function call and reschedule notification.
2. Refresh changelog.
Change in V3:
1. Add 128 vcpu ipi multicast support like x86
2. Change cpucfg base address from 0x10000000 to 0x40000000, in order
to avoid confliction with future hw usage
3. Adjust patch order in this patchset, move patch
Refine-ipi-ops-on-LoongArch-platform to the first one.